Transporters and ion channels

ABSTRACT

The invention provides human transporters and ion channels (TRICH) and polynucleotides which identify and encode TRICH. The invention also provides expression vectors, host cells, antibodies, agonists, and antagonists. The invention also provides methods for diagnosing, treating, or preventing disorders associated with aberrant expression of TRICH.

TECHNICAL FIELD

[0001] This invention relates to nucleic acid and amino acid sequences of transporters and ion channels and to the use of these sequences in the diagnosis, treatment, and prevention of transport, neurological, muscle, immunological and cell proliferative disorders, and in the assessment of the effects of exogenous compounds on the expression of nucleic acid and amino acid sequences of transporters and ion channels.

BACKGROUND OF THE INVENTION

[0002] Eukaryotic cells are surrounded and subdivided into functionally distinct organelles by hydrophobic lipid bilayer membranes which are highly impermeable to most polar molecules. Cells and organelles require transport proteins to import and export essential nutrients and metal ions including K⁺, NH₄⁺, Pᵢ, SO₄²⁻, sugars, and vitamins, as well as various metabolic waste products. Transport proteins also play roles in antibiotic resistance, toxin secretion, ion balance, synaptic neurotransmission, kidney function, intestinal absorption, tumor growth, and other diverse cell functions (Griffith, J. and C. Sansom (1998) The Transporter Facts Book, Academic Press, San Diego Calif., pp. 3-29). Transport can occur by a passive concentration-dependent mechanism, or can be linked to an energy source such as ATP hydrolysis or an ion gradient. Proteins that function in transport include carrier proteins, which bind to a specific solute and undergo a conformational change that translocates the bound solute across the membrane, and channel proteins, which form hydrophilic pores that allow specific solutes to diffuse through the membrane down an electrochemical solute gradient.

[0003] Carrier proteins which transport a single solute from one side of the membrane to the other are called uniporters. In contrast, coupled transporters link the transfer of one solute with simultaneous or sequential transfer of a second solute, either in the same direction (symport) or in the opposite direction (antiport). For example, intestinal and kidney epithelia contain a variety of symporter systems driven by the sodium gradient that exists across the plasma membrane. Sodium moves into the cell down its electrochemical gradient and brings the solute into the cell with it. The sodium gradient that provides the driving force for solute uptake is maintained by the ubiquitous Na⁺/K⁺ ATPase system. Sodium-coupled transporters include the mammalian glucose transporter (SGLT1), iodide transporter (NIS), and multivitamin transporter (SMVT). All three transporters have twelve putative transmembrane segments, extracellular glycosylation sites, and cytoplasmically-oriented N- and C-termini. NIS plays a crucial role in the evaluation, diagnosis, and treatment of various thyroid pathologies because it is the molecular basis for radioiodide thyroid-imaging techniques and for specific targeting of radioisotopes to the thyroid gland (Levy, O. et al. (1997) Proc. Natl. Acad. Sci. USA 94:5568-5573). SMVT is expressed in the intestinal mucosa, kidney, and placenta, and is implicated in the transport of the water-soluble vitamins, e.g., biotin and pantothenate (Prasad, P. D. et al. (1998) J. Biol. Chem. 273:7501-7506).
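The uniport, symport, and antiport distinction drawn above can be illustrated with a minimal Python sketch. The names `TransportMode` and `classify_transport`, and the convention of encoding each solute's direction as +1 (into the cell) or -1 (out of the cell), are assumptions made for this illustration only and are not part of the disclosure.

```python
from enum import Enum


class TransportMode(Enum):
    UNIPORT = "uniport"    # a single solute is moved
    SYMPORT = "symport"    # coupled solutes move in the same direction
    ANTIPORT = "antiport"  # coupled solutes move in opposite directions


def classify_transport(directions):
    """Classify a carrier from the directions of the solutes it moves.

    `directions` holds one entry per transported solute:
    +1 for movement into the cell, -1 for movement out of the cell
    (an encoding assumed for this sketch).
    """
    if not directions or any(d not in (1, -1) for d in directions):
        raise ValueError("directions must be a non-empty list of +1 / -1")
    if len(directions) == 1:
        return TransportMode.UNIPORT
    if len(set(directions)) == 1:
        return TransportMode.SYMPORT
    return TransportMode.ANTIPORT


# Sodium-coupled glucose uptake: Na+ and glucose both move inward.
print(classify_transport([+1, +1]).value)
```

Under this encoding, a sodium-driven symporter such as SGLT1 is represented as two inward movements, whereas an exchanger that moves one solute in and another out would be classified as an antiporter.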

[0004] One of the largest families of transporters is the major facilitator superfamily (MFS), also called the uniporter-symporter-antiporter family. MFS transporters are single polypeptide carriers that transport small solutes in response to ion gradients. Members of the MFS are found in all classes of living organisms, and include transporters for sugars, oligosaccharides, phosphates, nitrates, nucleosides, monocarboxylates, and drugs. MFS transporters found in eukaryotes all have a structure comprising 12 transmembrane segments (Pao, S. S. et al. (1998) Microbiol. Molec. Biol. Rev. 62:1-34). The largest family of MFS transporters is the sugar transporter family, which includes the seven glucose transporters (GLUT1-GLUT7) found in humans that are required for the transport of glucose and other hexose sugars. These glucose transport proteins have unique tissue distributions and physiological functions. GLUT1 provides many cell types with their basal glucose requirements and transports glucose across epithelial and endothelial barrier tissues; GLUT2 facilitates glucose uptake or efflux from the liver; GLUT3 regulates glucose supply to neurons; GLUT4 is responsible for insulin-regulated glucose disposal; and GLUT5 regulates fructose uptake into skeletal muscle. Defects in glucose transporters are involved in a recently identified neurological syndrome causing infantile seizures and developmental delay, as well as glycogen storage disease, Fanconi-Bickel syndrome, and non-insulin-dependent diabetes mellitus (Mueckler, M. (1994) Eur. J. Biochem. 219:713-725; Longo, N. and L. J. Elsas (1998) Adv. Pediatr. 45:293-313).

[0005] Monocarboxylate anion transporters are proton-coupled symporters with a broad substrate specificity that includes L-lactate, pyruvate, and the ketone bodies acetate, acetoacetate, and beta-hydroxybutyrate. At least seven isoforms have been identified to date. The isoforms are predicted to have twelve transmembrane (TM) helical domains with a large intracellular loop between TM6 and TM7, and play a critical role in maintaining intracellular pH by removing the protons that are produced stoichiometrically with lactate during glycolysis. The best characterized H⁺-monocarboxylate transporter is that of the erythrocyte membrane, which transports L-lactate and a wide range of other aliphatic monocarboxylates. Other cells possess H⁺-linked monocarboxylate transporters with differing substrate and inhibitor selectivities. In particular, cardiac muscle and tumor cells have transporters that differ in their Kₘ values for certain substrates, including stereoselectivity for L- over D-lactate, and in their sensitivity to inhibitors. There are Na⁺-monocarboxylate cotransporters on the luminal surface of intestinal and kidney epithelia, which allow the uptake of lactate, pyruvate, and ketone bodies in these tissues. In addition, there are specific and selective transporters for organic cations and organic anions in organs including the kidney, intestine and liver. Organic anion transporters are selective for hydrophobic, charged molecules with electron-attracting side groups. Organic cation transporters, such as the ammonium transporter, mediate the secretion of a variety of drugs and endogenous metabolites, and contribute to the maintenance of intercellular pH (Poole, R. C. and A. P. Halestrap (1993) Am. J. Physiol. 264:C761-C782; Price, N. T. et al. (1998) Biochem. J. 329:321-328; and Martinelle, K. and I. Haggstrom (1993) J. Biotechnol. 30:339-350).

[0006] ATP-binding cassette (ABC) transporters are members of a superfamily of membrane proteins that transport substances ranging from small molecules such as ions, sugars, amino acids, peptides, and phospholipids, to lipopeptides, large proteins, and complex hydrophobic drugs. ABC transporters consist of four modules: two nucleotide-binding domains (NBD), which hydrolyze ATP to supply the energy required for transport, and two membrane-spanning domains (MSD), each containing six putative transmembrane segments. These four modules may be encoded by a single gene, as is the case for the cystic fibrosis transmembrane regulator (CFTR), or by separate genes. When encoded by separate genes, each gene product contains a single NBD and MSD. These “half-molecules” form homo- and heterodimers, such as Tap1 and Tap2, the endoplasmic reticulum-based major histocompatibility (MHC) peptide transport system. Several genetic diseases are attributed to defects in ABC transporters, such as the following diseases and their corresponding proteins: cystic fibrosis (CFTR, an ion channel), adrenoleukodystrophy (adrenoleukodystrophy protein, ALDP), Zellweger syndrome (peroxisomal membrane protein-70, PMP70), and hyperinsulinemic hypoglycemia (sulfonylurea receptor, SUR). Overexpression of the multidrug resistance (MDR) protein, another ABC transporter, in human cancer cells makes the cells resistant to a variety of cytotoxic drugs used in chemotherapy (Taglicht, D. and S. Michaelis (1998) Meth. Enzymol. 292:130-162).

[0007] A number of metal ions such as iron, zinc, copper, cobalt, manganese, molybdenum, selenium, nickel, and chromium are important as cofactors for a number of enzymes. For example, copper is involved in hemoglobin synthesis, connective tissue metabolism, and bone development, by acting as a cofactor in oxidoreductases such as superoxide dismutase, ferroxidase (ceruloplasmin), and lysyl oxidase. Copper and other metal ions must be provided in the diet, and are absorbed by transporters in the gastrointestinal tract. Plasma proteins transport the metal ions to the liver and other target organs, where specific transporters move the ions into cells and cellular organelles as needed. Imbalances in metal ion metabolism have been associated with a number of disease states (Danks, D. M. (1986) J. Med. Genet. 23:99-106).

[0008] P-type ATPases comprise a class of cation-transporting transmembrane proteins. They are integral membrane proteins which use an aspartyl phosphate intermediate to move cations across a membrane. Features of P-type ATPases include: (i) a cation channel; (ii) a stalk, formed by extensions of the transmembrane α-helices into the cytoplasm; (iii) an ATP binding domain; (iv) a phosphorylated aspartic acid; (v) an adjacent transduction domain; (vi) a phosphatase domain, which removes the phosphate from the aspartic acid as part of the reaction cycle; and (vii) six or more transmembrane domains. Included in this class are heavy metal-transporting ATPases as well as aminophospholipid transporters.

[0009] The transport of phosphatidylserine and phosphatidylethanolamine by aminophospholipid translocase results in the movement of these molecules from one side of a bilayer to another. This transport is conducted by a newly identified subfamily of P-type ATPases which are proposed to be amphipath transporters. Amphipath transporters move molecules having both a hydrophilic and a hydrophobic region. As many as seventeen different genes belong to this P-type ATPase subfamily, being grouped into several distinct classes and subclasses (Halleck, M. S. et al. (1999) Physiol. Genomics 1:139-150; Vulpe, C. et al. (1993) Nat. Genet. 3:7-13).

[0010] Transport of fatty acids across the plasma membrane can occur by diffusion, a high capacity, low affinity process. However, under normal physiological conditions a significant fraction of fatty acid transport appears to occur via a high affinity, low capacity protein-mediated transport process. Fatty acid transport protein (FATP), an integral membrane protein with four transmembrane segments, is expressed in tissues exhibiting high levels of plasma membrane fatty acid flux, such as muscle, heart, and adipose. Expression of FATP is upregulated in 3T3-L1 cells during adipose conversion, and expression in COS7 fibroblasts elevates uptake of long-chain fatty acids (Hui, T. Y. et al. (1998) J. Biol. Chem. 273:27420-27429).

[0011] The lipocalin superfamily constitutes a phylogenetically conserved group of more than forty proteins that function as extracellular ligand-binding proteins which bind and transport small hydrophobic molecules. Members of this family function as carriers of retinoids, odorants, chromophores, pheromones, allergens, and sterols, and in a variety of processes including nutrient transport, cell growth regulation, immune response, and prostaglandin synthesis. A subset of these proteins may be multifunctional, serving as either a biosynthetic enzyme or as a specific enzyme inhibitor (Tanaka, T. et al. (1997) J. Biol. Chem. 272:15789-15795; and van't Hof, W. et al. (1997) J. Biol. Chem. 272:1837-1841).

[0012] Members of the lipocalin family display unusually low levels of overall sequence conservation. Pairwise sequence identity often falls below 20%. Sequence similarity between family members is limited to conserved cysteines which form disulfide bonds and three motifs which form a juxtaposed cluster that functions as a target cell recognition site. The lipocalins share an eight stranded, anti-parallel beta-sheet which folds back on itself to form a continuously hydrogen-bonded beta-barrel. The pocket formed by the barrel functions as an internal ligand binding site. Seven loops (L1 to L7) form short beta-hairpins, except loop L1 which is a large omega loop that forms a lid to partially close the internal ligand-binding site (Flower (1996) Biochem. J. 318:1-14).

[0013] Lipocalins are important transport molecules. Each lipocalin associates with a particular ligand and delivers that ligand to appropriate target sites within the organism. Retinol-binding protein (RBP), one of the best characterized lipocalins, transports retinol from stores within the liver to target tissues. Apolipoprotein D (apo D), a component of high density lipoproteins (HDLs) and low density lipoproteins (LDLs), functions in the targeted collection and delivery of cholesterol throughout the body. Lipocalins are also involved in cell regulatory processes. Apo D, which is identical to gross-cystic-disease-fluid protein (GCDFP)-24, is a progesterone/pregnenolone-binding protein expressed at high levels in breast cyst fluid. Secretion of apo D in certain human breast cancer cell lines is accompanied by reduced cell proliferation and progression of cells to a more differentiated phenotype. Similarly, apo D and another lipocalin, α₁-acid glycoprotein (AGP), are involved in nerve cell regeneration. AGP is also involved in anti-inflammatory and immunosuppressive activities. AGP is one of the positive acute-phase proteins (APP); circulating levels of AGP increase in response to stress and inflammatory stimulation. AGP accumulates at sites of inflammation where it inhibits platelet and neutrophil activation and inhibits phagocytosis. The immunomodulatory properties of AGP are due to glycosylation. AGP is 40% carbohydrate, making it unusually acidic and soluble. The glycosylation pattern of AGP changes during acute-phase response, and deglycosylated AGP has no immunosuppressive activity (Flower (1994) FEBS Lett. 354:7-11; Flower (1996) supra).

[0014] The lipocalin superfamily also includes several animal allergens, including the mouse major urinary protein (mMUP), the rat α-2-microglobulin (rA2U), the bovine β-lactoglobulin (βlg), the cockroach allergen (Bla g4), bovine dander allergen (Bos d2), and the major horse allergen, designated Equus caballus allergen 1 (Equ c1). Equ c1 is a powerful allergen responsible for about 80% of anti-horse IgE antibody response in patients who are chronically exposed to horse allergens. It appears that lipocalins may contain a common structure that is able to induce the IgE response (Gregoire, C. et al. (1996) J. Biol. Chem. 271:32951-32959).

[0015] Lipocalins are used as diagnostic and prognostic markers in a variety of disease states. The plasma level of AGP is monitored during pregnancy and in diagnosis and prognosis of conditions including cancer chemotherapy, renal dysfunction, myocardial infarction, arthritis, and multiple sclerosis. RBP is used clinically as a marker of tubular reabsorption in the kidney, and apo D is a marker in gross cystic breast disease (Flower (1996) supra). Additionally, the use of lipocalin animal allergens may help in the diagnosis of allergic reactions to horses (Gregoire supra), pigs, cockroaches, mice and rats.

[0016] Mitochondrial carrier proteins are transmembrane-spanning proteins which transport ions and charged metabolites between the cytosol and the mitochondrial matrix. Examples include the ADP, ATP carrier protein; the 2-oxoglutarate/malate carrier; the phosphate carrier protein; the pyruvate carrier; the dicarboxylate carrier which transports malate, succinate, fumarate, and phosphate; the tricarboxylate carrier which transports citrate and malate; and the Graves' disease carrier protein, a protein recognized by IgG in patients with active Graves' disease, an autoimmune disorder resulting in hyperthyroidism. Proteins in this family consist of three tandem repeats of an approximately 100 amino acid domain, each of which contains two transmembrane regions (Stryer, L. (1995) Biochemistry, W. H. Freeman and Company, New York N.Y., p. 551; PROSITE PDOC00189 Mitochondrial energy transfer proteins signature; Online Mendelian Inheritance in Man (OMIM) *275000 Graves Disease).

[0017] This class of transporters also includes the mitochondrial uncoupling proteins, which create proton leaks across the inner mitochondrial membrane, thus uncoupling oxidative phosphorylation from ATP synthesis. The result is energy dissipation in the form of heat. Mitochondrial uncoupling proteins have been implicated as modulators of thermoregulation and metabolic rate, and have been proposed as potential targets for drugs against metabolic diseases such as obesity (Ricquier, D. et al. (1999) J. Int. Med. 245:637-642).

[0018] Ion Channels

[0019] The electrical potential of a cell is generated and maintained by controlling the movement of ions across the plasma membrane. The movement of ions requires ion channels, which form ion-selective pores within the membrane. There are two basic types of ion channels, ion transporters and gated ion channels. Ion transporters utilize the energy obtained from ATP hydrolysis to actively transport an ion against the ion's concentration gradient. Gated ion channels allow passive flow of an ion down the ion's electrochemical gradient under restricted conditions. Together, these types of ion channels generate, maintain, and utilize an electrochemical gradient that is used in 1) electrical impulse conduction down the axon of a nerve cell, 2) transport of molecules into cells against concentration gradients, 3) initiation of muscle contraction, and 4) endocrine cell secretion.

[0020] Ion Transporters

[0021] Ion transporters generate and maintain the resting electrical potential of a cell. Utilizing the energy derived from ATP hydrolysis, they transport ions against the ion's concentration gradient. These transmembrane ATPases are divided into three families. The phosphorylated (P) class ion transporters, including Na⁺-K⁺ ATPase, Ca²⁺-ATPase, and H⁺-ATPase, are activated by a phosphorylation event. P-class ion transporters are responsible for maintaining resting potential distributions such that cytosolic concentrations of Na⁺ and Ca²⁺ are low and cytosolic concentration of K⁺ is high. The vacuolar (V) class of ion transporters includes H⁺ pumps on intracellular organelles, such as lysosomes and Golgi. V-class ion transporters are responsible for generating the low pH within the lumen of these organelles that is required for function. The coupling factor (F) class consists of H⁺ pumps in the mitochondria. F-class ion transporters utilize a proton gradient to generate ATP from ADP and inorganic phosphate (Pᵢ).

[0022] The P-ATPases are hexamers of a 100 kD subunit with ten transmembrane domains and several large cytoplasmic regions that may play a role in ion binding (Scarborough, G. A. (1999) Curr. Opin. Cell Biol. 11:517-522). P-type ATPases use an aspartyl phosphate intermediate to move cations across a membrane. Features of P-type ATPases include: (i) a cation channel; (ii) a stalk, formed by extensions of the transmembrane α-helices into the cytoplasm; (iii) an ATP binding domain; (iv) a phosphorylated aspartic acid; (v) an adjacent transduction domain; (vi) a phosphatase domain, which removes the phosphate from the aspartic acid as part of the reaction cycle; and (vii) six or more transmembrane domains. Included in this class are heavy metal-transporting ATPases as well as aminophospholipid transporters. The FIC1 gene encodes a P-type ATPase that is mutated in two forms of hereditary cholestasis. The protein product of FIC1 is likely to play an essential role in bile acid circulation in the liver (Bull, L. N. et al. (1998) Nat. Genet. 18:219-224). The V-ATPases are composed of two functional domains: the V₁ domain, a peripheral complex responsible for ATP hydrolysis; and the V₀ domain, an integral complex responsible for proton translocation across the membrane. The F-ATPases are structurally and evolutionarily related to the V-ATPases. The F-ATPase F₀ domain contains 12 copies of the c subunit, a highly hydrophobic protein composed of two transmembrane domains and containing a single buried carboxyl group in TM2 that is essential for proton transport. The V-ATPase V₀ domain contains three types of homologous c subunits with four or five transmembrane domains and the essential carboxyl group in TM4 or TM3. Both types of complex also contain a single a subunit that may be involved in regulating the pH dependence of activity (Forgac, M. (1999) J. Biol. Chem. 274:12951-12954).

[0023] The resting potential of the cell is utilized in many processes involving carrier proteins and gated ion channels. Carrier proteins utilize the resting potential to transport molecules into and out of the cell. Amino acid and glucose transport into many cells is linked to sodium ion co-transport (symport) so that the movement of Na⁺ down an electrochemical gradient drives transport of the other molecule up a concentration gradient. Similarly, cardiac muscle links transfer of Ca²⁺ out of the cell with transport of Na⁺ into the cell (antiport).

[0024] Gated Ion Channels

[0025] Gated ion channels control ion flow by regulating the opening and closing of pores. The ability to control ion flux through various gating mechanisms allows ion channels to mediate such diverse signaling and homeostatic functions as neuronal and endocrine signaling, muscle contraction, fertilization, and regulation of ion and pH balance. Gated ion channels are categorized according to the manner of regulating the gating function. Mechanically-gated channels open their pores in response to mechanical stress; voltage-gated channels (e.g., Na⁺, K⁺, Ca²⁺, and Cl⁻ channels) open their pores in response to changes in membrane potential; and ligand-gated channels (e.g., acetylcholine-, serotonin-, and glutamate-gated cation channels, and GABA- and glycine-gated chloride channels) open their pores in the presence of a specific ion, nucleotide, or neurotransmitter. The gating properties of a particular ion channel (i.e., its threshold for and duration of opening and closing) are sometimes modulated by association with auxiliary channel proteins and/or post-translational modifications, such as phosphorylation.

[0026] Mechanically-gated or mechanosensitive ion channels act as transducers for the senses of touch, hearing, and balance, and also play important roles in cell volume regulation, smooth muscle contraction, and cardiac rhythm generation. A stretch-inactivated channel (SIC) was recently cloned from rat kidney. The SIC channel belongs to a group of channels which are activated by pressure or stress on the cell membrane and conduct both Ca²⁺ and Na⁺ (Suzuki, M. et al. (1999) J. Biol. Chem. 274:6330-6335).

[0027] The pore-forming subunits of the voltage-gated cation channels form a superfamily of ion channel proteins. The characteristic domain of these channel proteins comprises six transmembrane domains (S1-S6), a pore-forming region (P) located between S5 and S6, and intracellular amino and carboxy termini. In the Na⁺ and Ca²⁺ subfamilies, this domain is repeated four times, while in the K⁺ channel subfamily, each channel is formed from a tetramer of either identical or dissimilar subunits. The P region contains information specifying the ion selectivity for the channel. In the case of K⁺ channels, a GYG tripeptide is involved in this selectivity (Ishii, T. M. et al. (1997) Proc. Natl. Acad. Sci. USA 94:11651-11656).
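As a simple illustration of how the GYG tripeptide mentioned above might be located in a candidate K⁺ channel sequence, the following Python sketch scans a one-letter amino acid string for the motif. The function name `find_motif` and the short example fragment are hypothetical and chosen only for illustration; the presence of GYG alone does not establish that a protein is a K⁺ channel.

```python
def find_motif(sequence, motif="GYG"):
    """Return the 0-based start positions of `motif` in `sequence`.

    Overlapping matches are reported. The sequence is expected in
    one-letter amino acid code; case is ignored.
    """
    seq = sequence.upper()
    pat = motif.upper()
    if not pat:
        raise ValueError("motif must not be empty")
    width = len(pat)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == pat]


# Made-up fragment standing in for part of a pore (P) region.
fragment = "TTVGYGDM"
print(find_motif(fragment))
```

In practice such a scan would be restricted to the region between the S5 and S6 segments, which requires a transmembrane topology prediction that this sketch does not attempt.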

[0028] Voltage-gated Na⁺ and K⁺ channels are necessary for the function of electrically excitable cells, such as nerve and muscle cells. Action potentials, which lead to neurotransmitter release and muscle contraction, arise from large, transient changes in the permeability of the membrane to Na⁺ and K⁺ ions. Depolarization of the membrane beyond the threshold level opens voltage-gated Na⁺ channels. Sodium ions flow into the cell, further depolarizing the membrane and opening more voltage-gated Na⁺ channels, which propagates the depolarization down the length of the cell. Depolarization also opens voltage-gated potassium channels. Consequently, potassium ions flow outward, which leads to repolarization of the membrane. Voltage-gated channels utilize charged residues in the fourth transmembrane segment (S4) to sense voltage change. The open state lasts only about 1 millisecond, at which time the channel spontaneously converts into an inactive state that cannot be opened irrespective of the membrane potential. Inactivation is mediated by the channel's N-terminus, which acts as a plug that closes the pore. The transition from an inactive to a closed state requires a return to resting potential.
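The closed, open, and inactive states described above can be summarized as a small state machine. The Python sketch below is a deliberate simplification under stated assumptions: the event names `"depolarize"`, `"elapse"` (standing for the roughly 1 millisecond open period running out), and `"rest"` (return to resting potential) are invented for this illustration, and any event not listed leaves the state unchanged.

```python
from enum import Enum


class GateState(Enum):
    CLOSED = "closed"      # can be opened by depolarization
    OPEN = "open"          # conducting; lasts about 1 ms
    INACTIVE = "inactive"  # plugged; cannot open at any membrane potential


# (current state, event) -> next state; event names are assumptions.
TRANSITIONS = {
    (GateState.CLOSED, "depolarize"): GateState.OPEN,
    (GateState.OPEN, "elapse"): GateState.INACTIVE,
    (GateState.INACTIVE, "rest"): GateState.CLOSED,
}


def step(state, event):
    """Apply one event; unlisted (state, event) pairs leave the state as is."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=GateState.CLOSED):
    """Apply a sequence of events starting from `state`."""
    for event in events:
        state = step(state, event)
    return state


# A second depolarization during inactivation does not reopen the channel.
print(run(["depolarize", "elapse", "depolarize"]).value)
```

The table makes the refractory behavior explicit: the only exit from the inactive state is a return to resting potential, after which the channel is closed and can respond to the next depolarization.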

[0029] Voltage-gated Na⁺ channels are heterotrimeric complexes composed of a 260 kDa pore-forming α subunit that associates with two smaller auxiliary subunits, β1 and β2. The β2 subunit is an integral membrane glycoprotein that contains an extracellular Ig domain, and its association with α and β1 subunits correlates with increased functional expression of the channel, a change in its gating properties, as well as an increase in whole cell capacitance due to an increase in membrane surface area (Isom, L. L. et al. (1995) Cell 83:433-442).

[0030] Non-voltage-gated Na⁺ channels include the members of the amiloride-sensitive Na⁺ channel/degenerin (NaC/DEG) family. Channel subunits of this family are thought to consist of two transmembrane domains flanking a long extracellular loop, with the amino and carboxyl termini located within the cell. The NaC/DEG family includes the epithelial Na⁺ channel (ENaC) involved in Na⁺ reabsorption in epithelia including the airway, distal colon, cortical collecting duct of the kidney, and exocrine duct glands. Mutations in ENaC result in pseudohypoaldosteronism type 1 and Liddle's syndrome (pseudohyperaldosteronism). The NaC/DEG family also includes the recently characterized H⁺-gated cation channels or acid-sensing ion channels (ASIC). ASIC subunits are expressed in the brain and form heteromultimeric Na⁺-permeable channels. These channels require acid pH fluctuations for activation. ASIC subunits show homology to the degenerins, a family of mechanically-gated channels originally isolated from C. elegans. Mutations in the degenerins cause neurodegeneration. ASIC subunits may also have a role in neuronal function, or in pain perception, since tissue acidosis causes pain (Waldmann, R. and M. Lazdunski (1998) Curr. Opin. Neurobiol. 8:418-424; Eglen, R. M. et al. (1999) Trends Pharmacol. Sci. 20:337-342).

[0031] K⁺ channels are located in all cell types, and may be regulated by voltage, ATP concentration, or second messengers such as Ca²⁺ and cAMP. In non-excitable tissue, K⁺ channels are involved in protein synthesis, control of endocrine secretions, and the maintenance of osmotic equilibrium across membranes. In neurons and other excitable cells, in addition to regulating action potentials and repolarizing membranes, K⁺ channels are responsible for setting the resting membrane potential. The cytosol contains non-diffusible anions and, to balance this net negative charge, the cell contains a Na⁺-K⁺ pump and ion channels that provide the redistribution of Na⁺, K⁺, and Cl⁻. The pump actively transports Na⁺ out of the cell and K⁺ into the cell in a 3:2 ratio. Ion channels in the plasma membrane allow K⁺ and Cl⁻ to flow by passive diffusion. Because of the high negative charge within the cytosol, Cl⁻ flows out of the cell. The flow of K⁺ is balanced by an electromotive force pulling K⁺ into the cell, and a K⁺ concentration gradient pushing K⁺ out of the cell. Thus, the resting membrane potential is primarily regulated by K⁺ flow (Salkoff, L. and T. Jegla (1995) Neuron 15:489-492).
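The balance described above, between the electrical force pulling K⁺ into the cell and the concentration gradient pushing it out, is conventionally quantified with the Nernst equation, E = (RT/zF) ln([ion]out/[ion]in). The Python sketch below evaluates it and also tallies the net charge moved by the 3:2 pump. The equation is standard physical chemistry rather than part of this disclosure, and the concentrations used (5 mM outside, 140 mM inside) and the temperature of 310.15 K are illustrative assumptions only.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_potential(conc_out, conc_in, z=1, temp_k=310.15):
    """Equilibrium potential in volts (inside relative to outside).

    conc_out and conc_in must share the same units; z is the ion valence.
    """
    if conc_out <= 0 or conc_in <= 0:
        raise ValueError("concentrations must be positive")
    return (R * temp_k) / (z * F) * math.log(conc_out / conc_in)


def pump_net_charge_out(cycles, na_out=3, k_in=2):
    """Net positive charges exported after `cycles` turns of a 3 Na+ : 2 K+ pump."""
    return cycles * (na_out - k_in)


# Assumed, illustrative K+ concentrations in mM.
e_k = nernst_potential(conc_out=5.0, conc_in=140.0)
print(f"E_K is about {e_k * 1000:.0f} mV")
```

With these assumed values the K⁺ equilibrium potential comes out at roughly -89 mV, which is consistent with the statement that K⁺ flow dominates the resting membrane potential.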

[0032] Potassium channel subunits of the Shaker-like superfamily all have the characteristic six transmembrane/1 pore domain structure. Four subunits combine as homo- or heterotetramers to form functional K⁺ channels. These pore-forming subunits also associate with various cytoplasmic β subunits that alter channel inactivation kinetics. The Shaker-like channel family includes the voltage-gated K⁺ channels as well as the delayed rectifier type channels such as the human ether-a-go-go related gene (HERG) associated with long QT, a cardiac dysrhythmia syndrome (Curran, M. E. (1998) Curr. Opin. Biotechnol. 9:565-572; Kaczorowski, G. J. and M. L. Garcia (1999) Curr. Opin. Chem. Biol. 3:448-458).

[0033] A second superfamily of K⁺ channels is composed of the inward rectifying channels (Kir). Kir channels have the property of preferentially conducting K⁺ currents in the inward direction. These proteins consist of a single potassium selective pore domain and two transmembrane domains, which correspond to the fifth and sixth transmembrane domains of voltage-gated K⁺ channels. Kir subunits also associate as tetramers. The Kir family includes ROMK1, mutations in which lead to Bartter syndrome, a renal tubular disorder. Kir channels are also involved in regulation of cardiac pacemaker activity, seizures and epilepsy, and insulin regulation (Doupnik, C. A. et al. (1995) Curr. Opin. Neurobiol. 5:268-277; Curran, supra).

[0034] The recently recognized TWIK K⁺ channel family includes the mammalian TWIK-1, TREK-1 and TASK proteins. Members of this family possess an overall structure with four transmembrane domains and two P domains. These proteins are probably involved in controlling the resting potential in a large set of cell types (Duprat, F. et al. (1997) EMBO J. 16:5464-5471).

[0035] The voltage-gated Ca²⁺ channels have been classified into several subtypes based upon their electrophysiological and pharmacological characteristics. L-type Ca²⁺ channels are predominantly expressed in heart and skeletal muscle where they play an essential role in excitation-contraction coupling. T-type channels are important for cardiac pacemaker activity, while N-type and P/Q-type channels are involved in the control of neurotransmitter release in the central and peripheral nervous system. The L-type and N-type voltage-gated Ca²⁺ channels have been purified and, though their functions differ dramatically, they have similar subunit compositions. The channels are composed of three subunits. The α₁ subunit forms the membrane pore and voltage sensor, while the α₂δ and β subunits modulate the voltage-dependence, gating properties, and the current amplitude of the channel. These subunits are encoded by at least six α₁, one α₂δ, and four β genes. A fourth subunit, γ, has been identified in skeletal muscle (Walker, D. et al. (1998) J. Biol. Chem. 273:2361-2367; McCleskey, E. W. (1994) Curr. Opin. Neurobiol. 4:304-312).

[0036] The high-voltage-activated Ca²⁺ channels that have been characterized biochemically include complexes of a pore-forming alpha1 subunit of approximately 190-250 kDa; a transmembrane complex of alpha2 and delta subunits; an intracellular beta subunit; and in some cases a transmembrane gamma subunit. A variety of alpha1 subunits, alpha2delta complexes, beta subunits, and gamma subunits are known. The Cav1 family of alpha1 subunits conduct L-type Ca²⁺ currents, which initiate muscle contraction, endocrine secretion, and gene transcription, and are regulated primarily by second messenger-activated protein phosphorylation pathways. The Cav2 family of alpha1 subunits conduct N-type, P/Q-type, and R-type Ca²⁺ currents, which initiate rapid synaptic transmission and are regulated primarily by direct interaction with G proteins and SNARE proteins and secondarily by protein phosphorylation. The Cav3 family of alpha1 subunits conduct T-type Ca²⁺ currents, which are activated and inactivated more rapidly and at more negative membrane potentials than other Ca²⁺ current types. The distinct structures and patterns of regulation of these three families of Ca²⁺ channels provide an array of Ca²⁺ entry pathways in response to changes in membrane potential and a range of possibilities for regulation of Ca²⁺ entry by second messenger pathways and interacting proteins (Catterall, W. A. (2000) Annu. Rev. Cell Dev. Biol. 16:521-555).

[0037] The alpha-2 subunit of the voltage-gated Ca²⁺ channel may include one or more Cache domains. An extracellular Cache domain may be fused to an intracellular catalytic domain, such as the histidine kinase, PP2C phosphatase, GGDEF (a predicted diguanylate cyclase), HD-GYP (a predicted phosphodiesterase) or adenylyl cyclase domain, or to a noncatalytic domain, like the methyl-accepting, DNA-binding winged helix-turn-helix, GAF, PAS or HAMP (a domain found in histidine kinases, adenylyl cyclases, methyl-binding proteins and phosphatases). Small molecules are bound via the Cache domain and this signal is converted into diverse outputs depending on the intracellular domains (Anantharaman, V. and Aravind, L. (2000) Trends Biochem. Sci. 25:535-537).

[0038] The transient receptor family (Trp) of calcium ion channels is thought to mediate capacitative calcium entry (CCE). CCE is the Ca²⁺ influx into cells to resupply Ca²⁺ stores depleted by the action of inositol trisphosphate (IP3) and other agents in response to numerous hormones and growth factors. Trp and Trp-like were first cloned from Drosophila and have similarity to voltage-gated Ca²⁺ channels in the S3 through S6 regions. This suggests that Trp and/or related proteins may form mammalian CCE channels (Zhu, X. et al. (1996) Cell 85:661-671; Boulay, G. et al. (1997) J. Biol. Chem. 272:29672-29680). Melastatin is a gene isolated in both the mouse and human, whose expression in melanoma cells is inversely correlated with melanoma aggressiveness in vivo. The human cDNA transcript corresponds to a 1533-amino acid protein having homology to members of the Trp family. It has been proposed that the combined use of melastatin mRNA expression status and tumor thickness might allow for the determination of subgroups of patients at both low and high risk for developing metastatic disease (Duncan, L. M. et al. (2001) J. Clin. Oncol. 19:568-576).

[0039] Chloride channels are necessary in endocrine secretion and in regulation of cytosolic and organelle pH. In secretory epithelial cells, Cl⁻ enters the cell across a basolateral membrane through an Na⁺, K⁺/Cl⁻ cotransporter, accumulating in the cell above its electrochemical equilibrium concentration. Secretion of Cl⁻ from the apical surface, in response to hormonal stimulation, leads to flow of Na⁺ and water into the secretory lumen. The cystic fibrosis transmembrane conductance regulator (CFTR) is a chloride channel encoded by the gene for cystic fibrosis, a common fatal genetic disorder in humans. CFTR is a member of the ABC transporter family, and is composed of two domains each consisting of six transmembrane domains followed by a nucleotide-binding site. Loss of CFTR function decreases transepithelial water secretion and, as a result, the layers of mucus that coat the respiratory tree, pancreatic ducts, and intestine are dehydrated and difficult to clear. The resulting blockage of these sites leads to pancreatic insufficiency, “meconium ileus”, and devastating “chronic obstructive pulmonary disease” (Al-Awqati, Q. et al. (1992) J. Exp. Biol. 172:245-266).

[0040] The voltage-gated chloride channels (CLC) are characterized by 10-12 transmembrane domains, as well as two small globular domains known as CBS domains. The CLC subunits probably function as homotetramers. CLC proteins are involved in regulation of cell volume, membrane potential stabilization, signal transduction, and transepithelial transport. Mutations in CLC-1, expressed predominantly in skeletal muscle, are responsible for autosomal recessive generalized myotonia and autosomal dominant myotonia congenita, while mutations in the kidney channel CLC-5 lead to kidney stones (Jentsch, T. J. (1996) Curr. Opin. Neurobiol. 6:303-310).

[0041] Ligand-gated channels open their pores when an extracellular or intracellular mediator binds to the channel. Neurotransmitter-gated channels are channels that open when a neurotransmitter binds to their extracellular domain. These channels exist in the postsynaptic membrane of nerve or muscle cells. There are two types of neurotransmitter-gated channels. Sodium channels open in response to excitatory neurotransmitters, such as acetylcholine, glutamate, and serotonin. This opening causes an influx of Na⁺ and produces the initial localized depolarization that activates the voltage-gated channels and starts the action potential. Chloride channels open in response to inhibitory neurotransmitters, such as γ-aminobutyric acid (GABA) and glycine, leading to hyperpolarization of the membrane and the subsequent inhibition of action potential generation. Neurotransmitter-gated ion channels have four transmembrane domains and probably function as pentamers (Jentsch, supra). Amino acids in the second transmembrane domain appear to be important in determining channel permeation and selectivity (Sather, W. A. et al. (1994) Curr. Opin. Neurobiol. 4:313-323).

[0042] Ligand-gated channels can be regulated by intracellular second messengers. For example, calcium-activated K⁺ channels are gated by internal calcium ions. In nerve cells, an influx of calcium during depolarization opens K⁺ channels to modulate the magnitude of the action potential (Ishi et al., supra). The large conductance (BK) channel has been purified from brain and its subunit composition determined. The α subunit of the BK channel has seven rather than six transmembrane domains in contrast to voltage-gated K⁺ channels. The extra transmembrane domain is located at the subunit N-terminus. A 28-amino-acid stretch in the C-terminal region of the subunit (the “calcium bowl” region) contains many negatively charged residues and is thought to be the region responsible for calcium binding. The β subunit consists of two transmembrane domains connected by a glycosylated extracellular loop, with intracellular N- and C-termini (Kaczorowski, supra; Vergara, C. et al. (1998) Curr. Opin. Neurobiol. 8:321-329).

[0043] Cyclic nucleotide-gated (CNG) channels are gated by cytosolic cyclic nucleotides. The best examples of these are the cAMP-gated Na⁺ channels involved in olfaction and the cGMP-gated cation channels involved in vision. Both systems involve ligand-mediated activation of a G-protein coupled receptor which then alters the level of cyclic nucleotide within the cell. CNG channels also represent a major pathway for Ca²⁺ entry into neurons, and play roles in neuronal development and plasticity. CNG channels are tetramers containing at least two types of subunits, an α subunit which can form functional homomeric channels, and a β subunit, which modulates the channel properties. All CNG subunits have six transmembrane domains and a pore forming region between the fifth and sixth transmembrane domains, similar to voltage-gated K⁺ channels. A large C-terminal domain contains a cyclic nucleotide binding domain, while the N-terminal domain confers variation among channel subtypes (Zufall, F. et al. (1997) Curr. Opin. Neurobiol. 7:404-412).

[0044] The activity of other types of ion channel proteins may also be modulated by a variety of intracellular signalling proteins. Many channels have sites for phosphorylation by one or more protein kinases including protein kinase A, protein kinase C, tyrosine kinase, and casein kinase II, all of which regulate ion channel activity in cells. Kir channels are activated by the binding of the Gβγ subunits of heterotrimeric G-proteins (Reimann, F. and F. M. Ashcroft (1999) Curr. Opin. Cell. Biol. 11:503-508). Other proteins are involved in the localization of ion channels to specific sites in the cell membrane. Such proteins include the PDZ domain proteins known as MAGUKs (membrane-associated guanylate kinases) which regulate the clustering of ion channels at neuronal synapses (Craven, S. E. and D. S. Bredt (1998) Cell 93:495-498).

[0045] Disease Correlation

[0046] The etiology of numerous human diseases and disorders can be attributed to defects in the transport of molecules across membranes. Defects in the trafficking of membrane-bound transporters and ion channels are associated with several disorders, e.g., cystic fibrosis, glucose-galactose malabsorption syndrome, hypercholesterolemia, von Gierke disease, and certain forms of diabetes mellitus. Single-gene defect diseases resulting in an inability to transport small molecules across membranes include, e.g., cystinuria, iminoglycinuria, Hartnup disease, and Fanconi disease (van't Hoff, W. G. (1996) Exp. Nephrol. 4:253-262; Talente, G. M. et al. (1994) Ann. Intern. Med. 120:218-226; and Chillon, M. et al. (1995) New Engl. J. Med. 332:1475-1480).

[0047] Human diseases caused by mutations in ion channel genes include disorders of skeletal muscle, cardiac muscle, and the central nervous system. Mutations in the pore-forming subunits of sodium and chloride channels cause myotonia, a muscle disorder in which relaxation after voluntary contraction is delayed. Sodium channel myotonias have been treated with channel blockers. Mutations in muscle sodium and calcium channels cause forms of periodic paralysis, while mutations in the sarcoplasmic calcium release channel, T-tubule calcium channel, and muscle sodium channel cause malignant hyperthermia. Cardiac arrhythmia disorders such as the long QT syndromes and idiopathic ventricular fibrillation are caused by mutations in potassium and sodium channels (Cooper, E. C. and L. Y. Jan (1999) Proc. Natl. Acad. Sci. USA 96:4759-4766). All four known human idiopathic epilepsy genes code for ion channel proteins (Berkovic, S. F. and I. E. Scheffer (1999) Curr. Opin. Neurology 12:177-182). Other neurological disorders such as ataxias, hemiplegic migraine and hereditary deafness can also result from mutations in ion channel genes (Jen, J. (1999) Curr. Opin. Neurobiol. 9:274-280; Cooper, supra).

[0048] Ion channels have been the target for many drug therapies. Neurotransmitter-gated channels have been targeted in therapies for treatment of insomnia, anxiety, depression, and schizophrenia. Voltage-gated channels have been targeted in therapies for arrhythmia, ischemic stroke, head trauma, and neurodegenerative disease (Taylor, C. P. and L. S. Narasimhan (1997) Adv. Pharmacol. 39:47-98). Various classes of ion channels also play an important role in the perception of pain, and thus are potential targets for new analgesics. These include the vanilloid-gated ion channels, which are activated by the vanilloid capsaicin, as well as by noxious heat. Local anesthetics such as lidocaine and mexiletine, which block voltage-gated Na⁺ channels, have been useful in the treatment of neuropathic pain (Eglen, supra).

[0049] Ion channels in the immune system have recently been suggested as targets for immunomodulation. T-cell activation depends upon calcium signaling, and a diverse set of T-cell specific ion channels has been characterized that affect this signaling process. Channel blocking agents can inhibit secretion of lymphokines, cell proliferation, and killing of target cells. A peptide antagonist of the T-cell potassium channel Kv1.3 was found to suppress delayed-type hypersensitivity and allogeneic responses in pigs, validating the idea of channel blockers as safe and efficacious immunosuppressants (Cahalan, M. D. and K. G. Chandy (1997) Curr. Opin. Biotechnol. 8:749-756).

[0050] Senescence

[0051] Most normal eukaryotic cells, after a certain number of divisions, enter a state of senescence in which cells remain viable and metabolically active but no longer replicate. A number of phenotypic changes such as increased cell size and pH-dependent beta-galactosidase activity, and molecular changes such as the upregulation of particular genes, occur in senescent cells (Shelton (1999) Current Biology 9:939-945). When senescent cells are exposed to mitogens, a number of genes are upregulated, but the cells do not proliferate. Evidence indicates that senescent cells accumulate with age in vivo, contributing to the aging of an organism. In addition, senescence suppresses tumorigenesis, and many genes necessary for senescence also function as tumor suppressor genes, such as p53 and the retinoblastoma susceptibility gene. Most tumors contain cells that have surpassed their replicative limit, i.e., they are immortalized. Many oncogenes immortalize cells as a first step toward tumor formation.

[0052] A variety of challenges, such as oxidative stress, radiation, activated oncoproteins, and cell cycle inhibitors, induce a senescent phenotype, indicating that senescence is influenced by a number of proliferative and anti-proliferative signals (Shelton, supra). Senescence is correlated with the progressive shortening of telomeres that occurs with each cell division. Expression of the catalytic component of telomerase in cells prevents telomere shortening and immortalizes cells such as fibroblasts and epithelial cells, but not other types of cells, such as CD8+ T cells (Migliaccio et al. (2000) J. Immunol. 165:4978-4984). Thus, senescence is controlled by telomere shortening as well as other mechanisms depending on the type of cell.

[0053] A number of genes that are differentially expressed between senescent and presenescent cells have been identified as part of ongoing studies to understand the role of senescence in aging and tumorigenesis. Most senescent cells are growth arrested in the G1 stage of the cell cycle. While expression of many cell cycle genes is similar in senescent and presenescent cells (Cristofalo (1992) Ann. N.Y. Acad. Sci. 663:187-194), expression of other genes, such as the cyclin-dependent kinase inhibitors p21 and p16, which inhibit proliferation, and cyclins D1 and E, is elevated in senescent cells. Other genes that are not directly involved in the cell cycle are also upregulated, such as those encoding the extracellular matrix proteins fibronectin, procollagen, and osteonectin, and proteases such as collagenase, stromelysin, and cathepsin B (Chen (2000) Ann. N.Y. Acad. Sci. 908:111-125). Genes underexpressed in senescent cells include those that encode heat shock proteins, c-fos, and cdc-2 (Chen, supra).

[0054] P-glycoprotein is a member of the ABC transporter family that is expressed on cells of the immune system and plays a role in the secretion of cytokines and cytotoxic molecules. P-glycoprotein expression and function were found to be increased in aging lymphocytes. These differences may play a role in the changes in immune response, including increased frequency of infections and autoimmune phenomena, associated with human aging (Aggrawal, S. et al. (1997) J. Clin. Immunol. 17:448-454).

[0055] The discovery of new transporters and ion channels, and the polynucleotides encoding them, satisfies a need in the art by providing new compositions which are useful in the diagnosis, prevention, and treatment of transport, neurological, muscle, immunological and cell proliferative disorders, and in the assessment of the effects of exogenous compounds on the expression of nucleic acid and amino acid sequences of transporters and ion channels.

SUMMARY OF THE INVENTION

[0056] The invention features purified polypeptides, transporters and ion channels, referred to collectively as “TRICH” and individually as “TRICH-1,” “TRICH-2,” “TRICH-3,” “TRICH-4,” “TRICH-5,” “TRICH-6,” “TRICH-7,” “TRICH-8,” “TRICH-9,” “TRICH-10,” “TRICH-11,” “TRICH-12,” “TRICH-13,” “TRICH-14,” “TRICH-15,” “TRICH-16,” “TRICH-17,” “TRICH-18,” “TRICH-19,” and “TRICH-20.” In one aspect, the invention provides an isolated polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20. In one alternative, the invention provides an isolated polypeptide comprising the amino acid sequence of SEQ ID NO:1-20.
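
By way of illustration only, the "at least 90% identical" criterion recited above can be sketched as a calculation over two sequences that have already been aligned. The function names, the gap character, and the choice of denominator (all aligned columns, counting gapped positions as mismatches) are assumptions made for this sketch; the alignment algorithm and parameters actually used for the sequences of the invention are not given in this passage.

```python
# Illustrative sketch only. Assumes the two sequences are ALREADY aligned
# (equal length, gaps written as "-"). The denominator convention used here
# (every aligned column, gapped columns counted as mismatches) is an
# assumption; other conventions exist.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return percent identity between two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # a column that is a gap in both sequences carries no information
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the pre-aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two aligned ten-residue sequences differing at a single position would be 90% identical under this convention and would satisfy the threshold.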

[0057] The invention further provides an isolated polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20. In one alternative, the polynucleotide encodes a polypeptide selected from the group consisting of SEQ ID NO:1-20. In another alternative, the polynucleotide is selected from the group consisting of SEQ ID NO:21-40.

[0058] Additionally, the invention provides a recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide encoding a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20. In one alternative, the invention provides a cell transformed with the recombinant polynucleotide. In another alternative, the invention provides a transgenic organism comprising the recombinant polynucleotide.

[0059] The invention also provides a method for producing a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20. The method comprises a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is transformed with a recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide encoding the polypeptide, and b) recovering the polypeptide so expressed.

[0060] Additionally, the invention provides an isolated antibody which specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20.

[0061] The invention further provides an isolated polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). In one alternative, the polynucleotide comprises at least 60 contiguous nucleotides.
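
As a rough illustration of items c) through e) above, the complement and an RNA counterpart of a DNA sequence can be derived by simple string operations. The helper names are hypothetical, only the unambiguous bases A, C, G, and T are handled, and "RNA equivalent" is taken here to mean the same linear sequence with thymine replaced by uracil, which is an assumption about how the term is used.

```python
# Illustrative sketch only; helper names are hypothetical.
# Handles only the unambiguous DNA bases A, C, G and T.

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Return the same linear sequence with thymine (T) replaced by uracil (U)."""
    return dna.upper().replace("T", "U")


def has_min_contiguous_length(seq: str, minimum: int = 60) -> bool:
    """Check the 'at least 60 contiguous nucleotides' alternative for one sequence."""
    return len(seq) >= minimum
```

For instance, `reverse_complement("ATGC")` gives `"GCAT"` and `rna_equivalent("ATGC")` gives `"AUGC"`.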

[0062] Additionally, the invention provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex, and optionally, if present, the amount thereof. In one alternative, the probe comprises at least 60 contiguous nucleotides.
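
The probe requirement in step a) can be pictured in silico as a search for a stretch of at least 20 contiguous probe nucleotides that is perfectly complementary to the target. The sketch below is a simplification under stated assumptions: it requires exact complementarity, handles DNA bases only, and ignores hybridization conditions entirely; the function names are hypothetical.

```python
# Illustrative sketch only; function names are hypothetical. Exact
# complementarity over a window is a simplification: real hybridization
# depends on stringency conditions that are not modeled here.

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def probe_is_complementary(probe: str, target: str, min_len: int = 20) -> bool:
    """True if any window of `min_len` contiguous probe nucleotides is
    exactly complementary to some stretch of the target."""
    probe, target = probe.upper(), target.upper()
    if len(probe) < min_len:
        return False
    for start in range(len(probe) - min_len + 1):
        window = probe[start:start + min_len]
        # A window pairs with the target where its reverse complement occurs.
        if reverse_complement(window) in target:
            return True
    return False
```

Passing `min_len=60` gives the corresponding check for the 60-nucleotide alternative.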

[0063] The invention further provides a method for detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide selected from the group consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.

[0064] The invention further provides a composition comprising an effective amount of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and a pharmaceutically acceptable excipient. In one embodiment, the composition comprises an amino acid sequence selected from the group consisting of SEQ ID NO:1-20. The invention additionally provides a method of treating a disease or condition associated with decreased expression of functional TRICH, comprising administering to a patient in need of such treatment the composition.

[0065] The invention also provides a method for screening a compound for effectiveness as an agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting agonist activity in the sample. In one alternative, the invention provides a composition comprising an agonist compound identified by the method and a pharmaceutically acceptable excipient. In another alternative, the invention provides a method of treating a disease or condition associated with decreased expression of functional TRICH, comprising administering to a patient in need of such treatment the composition.

[0066] Additionally, the invention provides a method for screening a compound for effectiveness as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20. The method comprises a) exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in the sample. In one alternative, the invention provides a composition comprising an antagonist compound identified by the method and a pharmaceutically acceptable excipient. In another alternative, the invention provides a method of treating a disease or condition associated with overexpression of functional TRICH, comprising administering to a patient in need of such treatment the composition.

[0067] The invention further provides a method of screening for a compound that specifically binds to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20. The method comprises a) combining the polypeptide with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide to the test compound, thereby identifying a compound that specifically binds to the polypeptide.

[0068] The invention further provides a method of screening for a compound that modulates the activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20. The method comprises a) combining the polypeptide with at least one test compound under conditions permissive for the activity of the polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c) comparing the activity of the polypeptide in the presence of the test compound with the activity of the polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide.
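
The comparison in step c) can be sketched numerically as the ratio of mean activity measured with the test compound to mean activity measured without it. Everything specific below is an assumption for illustration: the function names, the use of replicate means, and the 20% tolerance band are not taken from the specification, which requires only "a change in the activity".

```python
# Illustrative sketch only. The function names, the use of replicate means,
# and the 20% tolerance band are assumptions, not part of the described method.
from statistics import mean
from typing import Sequence


def modulation_ratio(with_compound: Sequence[float],
                     without_compound: Sequence[float]) -> float:
    """Mean activity with the test compound divided by mean activity without it."""
    baseline = mean(without_compound)
    if baseline == 0:
        raise ValueError("baseline activity is zero; ratio is undefined")
    return mean(with_compound) / baseline


def classify_modulation(ratio: float, tolerance: float = 0.2) -> str:
    """Label a ratio as an increase, a decrease, or no clear change in activity."""
    if ratio > 1 + tolerance:
        return "increases activity"
    if ratio < 1 - tolerance:
        return "decreases activity"
    return "no clear change"
```

In practice the tolerance would be replaced by whatever statistical criterion suits the assay; the fixed band is used here only to keep the sketch short.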

[0069] The invention further provides a method for screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, the method comprising a) exposing a sample comprising the target polynucleotide to a compound, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.

[0070] The invention further provides a method for assessing toxicity of a test compound, said method comprising a) treating a biological sample containing nucleic acids with the test compound; b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, iii) a polynucleotide having a sequence complementary to i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide selected from the group consisting of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, iii) a polynucleotide complementary to the polynucleotide of i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Alternatively, the target polynucleotide comprises a fragment of a polynucleotide sequence selected from the group consisting of i)-v) above; c) quantifying the amount of hybridization complex; and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.

BRIEF DESCRIPTION OF THE TABLES

[0071] Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide sequences of the present invention.

[0072] Table 2 shows the GenBank identification number and annotation of the nearest GenBank homolog, and the PROTEOME database identification numbers and annotations of PROTEOME database homologs, for polypeptides of the invention. The probability scores for the matches between each polypeptide and its homolog(s) are also shown.

[0073] Table 3 shows structural features of polypeptide sequences of the invention, including predicted motifs and domains, along with the methods, algorithms, and searchable databases used for analysis of the polypeptides.

[0074] Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble polynucleotide sequences of the invention, along with selected fragments of the polynucleotide sequences.

[0075] Table 5 shows the representative cDNA library for polynucleotides of the invention.

[0076] Table 6 provides an appendix which describes the tissues and vectors used for construction of the cDNA libraries shown in Table 5.

[0077] Table 7 shows the tools, programs, and algorithms used to analyze the polynucleotides and polypeptides of the invention, along with applicable descriptions, references, and threshold parameters.

DESCRIPTION OF THE INVENTION

[0078] Before the present proteins, nucleotide sequences, and methods are described, it is understood that this invention is not limited to the particular machines, materials and methods described, as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims.

[0079] It must be noted that as used herein and in the appended claims,the singular forms “a,” “an,” and “the” include plural reference unlessthe context clearly dictates otherwise. Thus, for example, a referenceto “a host cell” includes a plurality of such host cells, and areference to “an antibody” is a reference to one or more antibodies andequivalents thereof known to those skilled in the art, and so forth.

[0080] Unless defined otherwise, all technical and scientific terms usedherein have the same meanings as commonly understood by one of ordinaryskill in the art to which this invention belongs. Although any machines,materials, and methods similar or equivalent to those described hereincan be used to practice or test the present invention, the preferredmachines, materials and methods are now described. All publicationsmentioned herein are cited for the purpose of describing and disclosingthe cell lines, protocols, reagents and vectors which are reported inthe publications and which might be used in connection with theinvention. Nothing herein is to be construed as an admission that theinvention is not entitled to antedate such disclosure by virtue of priorinvention.

DEFINITIONS

[0081] “TRICH” refers to the amino acid sequences of substantially purified TRICH obtained from any species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant.

[0082] The term “agonist” refers to a molecule which intensifies or mimics the biological activity of TRICH. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other compound or composition which modulates the activity of TRICH either by directly interacting with TRICH or by acting on components of the biological pathway in which TRICH participates.

[0083] An “allelic variant” is an alternative form of the gene encoding TRICH. Allelic variants may result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in polypeptides whose structure or function may or may not be altered. A gene may have none, one, or many allelic variants of its naturally occurring form. Common mutational changes which give rise to allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides. Each of these types of changes may occur alone, or in combination with the others, one or more times in a given sequence.

[0084] “Altered” nucleic acid sequences encoding TRICH include those sequences with deletions, insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as TRICH or a polypeptide with at least one functional characteristic of TRICH. Included within this definition are polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe of the polynucleotide encoding TRICH, and improper or unexpected hybridization to allelic variants, with a locus other than the normal chromosomal locus for the polynucleotide sequence encoding TRICH. The encoded protein may also be “altered,” and may contain deletions, insertions, or substitutions of amino acid residues which produce a silent change and result in a functionally equivalent TRICH. Deliberate amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as long as the biological or immunological activity of TRICH is retained. For example, negatively charged amino acids may include aspartic acid and glutamic acid, and positively charged amino acids may include lysine and arginine. Amino acids with uncharged polar side chains having similar hydrophilicity values may include: asparagine and glutamine; and serine and threonine. Amino acids with uncharged side chains having similar hydrophilicity values may include: leucine, isoleucine, and valine; glycine and alanine; and phenylalanine and tyrosine.

[0085] The terms “amino acid” and “amino acid sequence” refer to an oligopeptide, peptide, polypeptide, or protein sequence, or a fragment of any of these, and to naturally occurring or synthetic molecules. Where “amino acid sequence” is recited to refer to a sequence of a naturally occurring protein molecule, “amino acid sequence” and like terms are not meant to limit the amino acid sequence to the complete native amino acid sequence associated with the recited protein molecule.

[0086] “Amplification” relates to the production of additional copies of a nucleic acid sequence. Amplification is generally carried out using polymerase chain reaction (PCR) technologies well known in the art.

[0087] The term “antagonist” refers to a molecule which inhibits or attenuates the biological activity of TRICH. Antagonists may include proteins such as antibodies, nucleic acids, carbohydrates, small molecules, or any other compound or composition which modulates the activity of TRICH either by directly interacting with TRICH or by acting on components of the biological pathway in which TRICH participates.

[0088] The term “antibody” refers to intact immunoglobulin molecules as well as to fragments thereof, such as Fab, F(ab′)₂, and Fv fragments, which are capable of binding an epitopic determinant. Antibodies that bind TRICH polypeptides can be prepared using intact polypeptides or using fragments containing small peptides of interest as the immunizing antigen. The polypeptide or oligopeptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired. Commonly used carriers that are chemically coupled to peptides include bovine serum albumin, thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize the animal.

[0089] The term “antigenic determinant” refers to that region of a molecule (i.e., an epitope) that makes contact with a particular antibody. When a protein or a fragment of a protein is used to immunize a host animal, numerous regions of the protein may induce the production of antibodies which bind specifically to antigenic determinants (particular regions or three-dimensional structures on the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen used to elicit the immune response) for binding to an antibody.

[0090] The term “aptamer” refers to a nucleic acid or oligonucleotide molecule that binds to a specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX (Systematic Evolution of Ligands by EXponential Enrichment), described in U.S. Pat. No. 5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries. Aptamer compositions may be double-stranded or single-stranded, and may include deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules. The nucleotide components of an aptamer may have modified sugar groups (e.g., the 2′-OH group of a ribonucleotide may be replaced by 2′-F or 2′-NH₂), which may improve a desired property, e.g., resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules, e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system. Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a cross-linker. (See, e.g., Brody, E. N. and L. Gold (2000) J. Biotechnol. 74:5-13.)

[0091] The term “intramer” refers to an aptamer which is expressed in vivo. For example, a vaccinia virus-based RNA expression system has been used to express specific RNA aptamers at high levels in the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA 96:3606-3610).

[0092] The term “spiegelmer” refers to an aptamer which includes L-DNA, L-RNA, or other left-handed nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on substrates containing right-handed nucleotides.

[0093] The term “antisense” refers to any composition capable of base-pairing with the “sense” (coding) strand of a specific nucleic acid sequence. Antisense compositions may include DNA; RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides having modified sugar groups such as 2′-methoxyethyl sugars or 2′-methoxyethoxy sugars; or oligonucleotides having modified bases such as 5-methyl cytosine, 2′-deoxyuracil, or 7-deaza-2′-deoxyguanosine. Antisense molecules may be produced by any method including chemical synthesis or transcription. Once introduced into a cell, the complementary antisense molecule base-pairs with a naturally occurring nucleic acid sequence produced by the cell to form duplexes which block either transcription or translation. The designation “negative” or “minus” can refer to the antisense strand, and the designation “positive” or “plus” can refer to the sense strand of a reference DNA molecule.

[0094] The term “biologically active” refers to a protein having structural, regulatory, or biochemical functions of a naturally occurring molecule. Likewise, “immunologically active” or “immunogenic” refers to the capability of the natural, recombinant, or synthetic TRICH, or of any oligopeptide thereof, to induce a specific immune response in appropriate animals or cells and to bind with specific antibodies.

[0095] “Complementary” describes the relationship between two single-stranded nucleic acid sequences that anneal by base-pairing. For example, 5′-AGT-3′ pairs with its complement, 3′-TCA-5′.
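The pairing rule in the example above can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for illustration; the sketch assumes unmodified DNA bases (A, C, G, T) and standard Watson-Crick pairing.

```python
# Watson-Crick pairing for unmodified DNA bases (illustrative assumption).
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq_5_to_3: str) -> str:
    """Return the complementary strand written 3'->5', base for base."""
    return "".join(PAIRS[base] for base in seq_5_to_3.upper())


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the complementary strand rewritten in the 5'->3' direction."""
    return complement(seq_5_to_3)[::-1]


print(complement("AGT"))          # TCA, read 3'->5' as in the text
print(reverse_complement("AGT"))  # ACT, the same strand read 5'->3'
```

The first call reproduces the 5′-AGT-3′ / 3′-TCA-5′ pair given above; the second shows the same complementary strand in the conventional 5′ to 3′ orientation.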

[0096] A “composition comprising a given polynucleotide sequence” and a “composition comprising a given amino acid sequence” refer broadly to any composition containing the given polynucleotide or amino acid sequence. The composition may comprise a dry formulation or an aqueous solution. Compositions comprising polynucleotide sequences encoding TRICH or fragments of TRICH may be employed as hybridization probes. The probes may be stored in freeze-dried form and may be associated with a stabilizing agent such as a carbohydrate. In hybridizations, the probe may be deployed in an aqueous solution containing salts (e.g., NaCl), detergents (e.g., sodium dodecyl sulfate; SDS), and other components (e.g., Denhardt's solution, dry milk, salmon sperm DNA, etc.).

[0097] “Consensus sequence” refers to a nucleic acid sequence which has been subjected to repeated DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied Biosystems, Foster City Calif.) in the 5′ and/or the 3′ direction, and resequenced, or which has been assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer program for fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison Wis.) or Phrap (University of Washington, Seattle Wash.). Some sequences have been both extended and assembled to produce the consensus sequence.

[0098] “Conservative amino acid substitutions” are those substitutions that are predicted to least interfere with the properties of the original protein, i.e., the structure and especially the function of the protein is conserved and not significantly changed by such substitutions. The table below shows amino acids which may be substituted for an original amino acid in a protein and which are regarded as conservative amino acid substitutions.

Original Residue    Conservative Substitution
Ala                 Gly, Ser
Arg                 His, Lys
Asn                 Asp, Gln, His
Asp                 Asn, Glu
Cys                 Ala, Ser
Gln                 Asn, Glu, His
Glu                 Asp, Gln, His
Gly                 Ala
His                 Asn, Arg, Gln, Glu
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 His, Met, Leu, Trp, Tyr
Ser                 Cys, Thr
Thr                 Ser, Val
Trp                 Phe, Tyr
Tyr                 His, Phe, Trp
Val                 Ile, Leu, Thr

[0099] Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation, (b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of the side chain.
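As an illustration only, the substitution table in paragraph [0098] can be encoded as a lookup. The sketch below is in Python; the names `CONSERVATIVE` and `is_conservative` are hypothetical. The lookup is deliberately directional (original residue to substitute), because the table as given is not symmetric.

```python
# Conservative substitutions transcribed from the table in paragraph [0098].
# Keys are original residues; values are the listed substitutes.
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, substitute: str) -> bool:
    """True if the table lists `substitute` for `original` (directional)."""
    return substitute in CONSERVATIVE.get(original, set())


print(is_conservative("Ala", "Gly"))  # True: Gly is listed for Ala
print(is_conservative("Ser", "Ala"))  # False: Ala is not listed for Ser
```

Residues absent from the table (for example Pro) simply return False for every substitute in this sketch.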

[0100] A “deletion” refers to a change in the amino acid or nucleotide sequence that results in the absence of one or more amino acid residues or nucleotides.

[0101] The term “derivative” refers to a chemically modified polynucleotide or polypeptide. Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains at least one biological or immunological function of the natural molecule. A derivative polypeptide is one modified by glycosylation, pegylation, or any similar process that retains at least one biological or immunological function of the polypeptide from which it was derived.

[0102] A “detectable label” refers to a reporter molecule or enzyme that is capable of generating a measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

[0103] “Differential expression” refers to increased or upregulated; or decreased, downregulated, or absent gene or protein expression, determined by comparing at least two different samples. Such comparisons may be carried out between, for example, a treated and an untreated sample, or a diseased and a normal sample.
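A comparison of two samples of this kind might be sketched as follows in Python. This is an illustrative sketch only: the function name, the use of a log2 ratio, and the two-fold cutoff are assumptions introduced here and are not specified in the text.

```python
import math


def classify_expression(treated: float, untreated: float,
                        min_fold: float = 2.0) -> str:
    """Classify a treated signal relative to an untreated reference.

    `min_fold` is a hypothetical cutoff chosen for illustration.
    """
    if untreated <= 0:
        raise ValueError("reference signal must be positive")
    if treated <= 0:
        return "absent"
    log_ratio = math.log2(treated / untreated)
    cutoff = math.log2(min_fold)
    if log_ratio >= cutoff:
        return "increased"
    if log_ratio <= -cutoff:
        return "decreased"
    return "unchanged"


print(classify_expression(400.0, 100.0))  # increased
print(classify_expression(120.0, 100.0))  # unchanged
```

In practice the signals would be quantified hybridization or protein measurements, and the cutoff would be chosen to suit the assay's noise level.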

[0104] “Exon shuffling” refers to the recombination of different coding regions (exons). Since an exon may represent a structural or functional domain of the encoded protein, new proteins may be assembled through the novel reassortment of stable substructures, thus allowing acceleration of the evolution of new protein functions.

[0105] A “fragment” is a unique portion of TRICH or the polynucleotide encoding TRICH which is identical in sequence to but shorter in length than the parent sequence. A fragment may comprise up to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a fragment may comprise from 5 to 1000 contiguous nucleotides or amino acid residues. A fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least 5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino acid residues in length. Fragments may be preferentially selected from certain regions of a molecule. For example, a polypeptide fragment may comprise a certain length of contiguous amino acids selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by the specification, including the Sequence Listing, tables, and figures, may be encompassed by the present embodiments.

[0106] A fragment of SEQ ID NO:21-40 comprises a region of unique polynucleotide sequence that specifically identifies SEQ ID NO:21-40, for example, as distinct from any other sequence in the genome from which the fragment was obtained. A fragment of SEQ ID NO:21-40 is useful, for example, in hybridization and amplification technologies and in analogous methods that distinguish SEQ ID NO:21-40 from related polynucleotide sequences. The precise length of a fragment of SEQ ID NO:21-40 and the region of SEQ ID NO:21-40 to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.
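One simple way to picture a "region of unique polynucleotide sequence" is as a contiguous window of the target that does not occur in any comparison sequence. The Python sketch below is illustrative only: the function name and the toy sequences are hypothetical, it considers exact matches on one strand, and it ignores reverse complements and mismatch-tolerant hybridization.

```python
def unique_fragments(target: str, others: list[str], k: int = 20) -> list[str]:
    """Return k-length windows of `target` absent from every sequence in `others`."""
    background = set()
    for seq in others:
        for i in range(len(seq) - k + 1):
            background.add(seq[i:i + k])
    return [
        target[i:i + k]
        for i in range(len(target) - k + 1)
        if target[i:i + k] not in background
    ]


# Toy example with k=4 so the result is easy to follow by eye.
print(unique_fragments("ACGTACGGA", ["ACGTACCC"], k=4))
# ['TACG', 'ACGG', 'CGGA']
```

With real sequences, `k` would be set to the intended fragment length (for example 20) and `others` would hold the related sequences to be distinguished.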

[0107] A fragment of SEQ ID NO:1-20 is encoded by a fragment of SEQ ID NO:21-40. A fragment of SEQ ID NO:1-20 comprises a region of unique amino acid sequence that specifically identifies SEQ ID NO:1-20. For example, a fragment of SEQ ID NO:1-20 is useful as an immunogenic peptide for the development of antibodies that specifically recognize SEQ ID NO:1-20. The precise length of a fragment of SEQ ID NO:1-20 and the region of SEQ ID NO:1-20 to which the fragment corresponds are routinely determinable by one of ordinary skill in the art based on the intended purpose for the fragment.

[0108] A “full length” polynucleotide sequence is one containing at least a translation initiation codon (e.g., methionine) followed by an open reading frame and a translation termination codon. A “full length” polynucleotide sequence encodes a “full length” polypeptide sequence.
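The three elements named in this definition can be checked mechanically. The Python sketch below is a simplified illustration: it assumes the input is the coding region alone (no untranslated flanking sequence), an ATG initiation codon, and the standard stop codons; the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def is_full_length(cds: str) -> bool:
    """Check for start codon, uninterrupted reading frame, and terminal stop."""
    cds = cds.upper()
    if len(cds) < 6 or len(cds) % 3 != 0:
        return False
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    starts_with_atg = codons[0] == "ATG"
    ends_with_stop = codons[-1] in STOP_CODONS
    no_internal_stop = not any(c in STOP_CODONS for c in codons[1:-1])
    return starts_with_atg and ends_with_stop and no_internal_stop


print(is_full_length("ATGGCCTAA"))     # True
print(is_full_length("ATGTAAGCCTAA"))  # False: in-frame stop before the end
```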

[0109] “Homology” refers to sequence similarity or, interchangeably, sequence identity, between two or more polynucleotide sequences or two or more polypeptide sequences.

[0110] The terms “percent identity” and “% identity,” as applied to polynucleotide sequences, refer to the percentage of residue matches between at least two polynucleotide sequences aligned using a standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps in the sequences being compared in order to optimize alignment between two sequences, and therefore achieve a more meaningful comparison of the two sequences.
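For two sequences that have already been aligned (gaps inserted), the percentage of residue matches can be computed as in the minimal Python sketch below. The choice of denominator is an assumption made here for illustration (all aligned columns, including gapped ones); alignment programs differ on this point, and the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of aligned columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = list(zip(aligned_a.upper(), aligned_b.upper()))
    # Ignore columns that are gaps in both sequences.
    columns = [(x, y) for x, y in pairs if not (x == gap and y == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


# Four matching columns out of six aligned columns.
print(round(percent_identity("ACGT-A", "ACCTGA"), 1))  # 66.7
```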

[0111] Percent identity between polynucleotide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program. This program is part of the LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR, Madison Wis.). CLUSTAL V is described in Higgins, D. G. and P. M. Sharp (1989) CABIOS 5:151-153 and in Higgins, D. G. et al. (1992) CABIOS 8:189-191. For pairwise alignments of polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap penalty=5, window=4, and “diagonals saved”=4. The “weighted” residue weight table is selected as the default. Percent identity is reported by CLUSTAL V as the “percent similarity” between aligned polynucleotide sequences.

[0112] Alternatively, a suite of commonly used and freely available sequence comparison algorithms is provided by the National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) (Altschul, S. F. et al. (1990) J. Mol. Biol. 215:403-410), which is available from several sources, including the NCBI, Bethesda, Md., and on the Internet at http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence analysis programs including “blastn,” that is used to align a known polynucleotide sequence with other polynucleotide sequences from a variety of databases. Also available is a tool called “BLAST 2 Sequences” that is used for direct pairwise comparison of two nucleotide sequences. “BLAST 2 Sequences” can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html. The “BLAST 2 Sequences” tool can be used for both blastn and blastp (discussed below). BLAST programs are commonly used with gap and other parameters set to default settings. For example, to compare two nucleotide sequences, one may use blastn with the “BLAST 2 Sequences” tool Version 2.0.12 (Apr. 21, 2000) set at default parameters. Such default parameters may be, for example:

[0113] Matrix: BLOSUM62

[0114] Reward for match: 1

[0115] Penalty for mismatch: −2

[0116] Open Gap: 5 and Extension Gap: 2 penalties

[0117] Gap x drop-off: 50

[0118] Expect: 10

[0119] Word Size: 11

[0120] Filter: on
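To show how the match reward, mismatch penalty, and gap penalties listed above combine, the following Python sketch scores a pre-aligned pair of nucleotide sequences. It is not the BLAST program itself: it assumes an affine gap model in which a gap of length k costs the open penalty plus k times the extension penalty, and the function name is hypothetical.

```python
def raw_alignment_score(aligned_a: str, aligned_b: str,
                        reward: int = 1, penalty: int = -2,
                        gap_open: int = 5, gap_extend: int = 2) -> int:
    """Sum match/mismatch scores and affine gap costs over aligned columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    gap_in = None  # which sequence the current gap run is in: "a", "b", or None
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            side = "a" if x == "-" else "b"
            if gap_in != side:
                score -= gap_open  # a new gap run starts here
            score -= gap_extend    # every gapped column pays the extension cost
            gap_in = side
        else:
            score += reward if x == y else penalty
            gap_in = None
    return score


print(raw_alignment_score("ACGTACGT", "ACGTACGT"))  # 8
print(raw_alignment_score("ACGT--GT", "ACGTACGT"))  # -3
```

In the second call, six matches contribute 6 and the single two-base gap costs 5 + 2×2 = 9, giving a raw score of -3.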

[0121] Percent identity may be measured over the length of an entire defined sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous nucleotides. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to describe a length over which percentage identity may be measured.

[0122] Nucleic acid sequences that do not show a high degree of identity may nevertheless encode similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid sequences that all encode substantially the same protein.
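The effect of degeneracy can be seen in a small Python sketch. The codon table below is only a subset of the standard genetic code, sufficient for the two made-up example sequences; the sequences and names are hypothetical and are not taken from the Sequence Listing.

```python
# Subset of the standard genetic code ("*" marks a stop codon).
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "CTG": "L", "TTA": "L",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TGA": "*",
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence using the codon subset above."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


seq_1 = "ATGGCTCTGAAATAA"
seq_2 = "ATGGCGTTAAAGTGA"  # differs from seq_1 at several positions

print(translate(seq_1))  # MALK*
print(translate(seq_2))  # MALK*
```

Although the two nucleotide sequences differ, both encode the same short peptide (Met-Ala-Leu-Lys) followed by a stop codon.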

[0123] The phrases “percent identity” and “% identity,” as applied to polypeptide sequences, refer to the percentage of residue matches between at least two polypeptide sequences aligned using a standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some alignment methods take into account conservative amino acid substitutions. Such conservative substitutions, explained in more detail above, generally preserve the charge and hydrophobicity at the site of substitution, thus preserving the structure (and therefore function) of the polypeptide.

[0124] Percent identity between polypeptide sequences may be determined using the default parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e sequence alignment program (described and referenced above). For pairwise alignments of polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap penalty=3, window=5, and “diagonals saved”=5. The PAM250 matrix is selected as the default residue weight table. As with polynucleotide alignments, the percent identity is reported by CLUSTAL V as the “percent similarity” between aligned polypeptide sequence pairs.

[0125] Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise comparison of two polypeptide sequences, one may use the “BLAST 2 Sequences” tool Version 2.0.12 (Apr. 21, 2000) with blastp set at default parameters. Such default parameters may be, for example:

[0126] Matrix: BLOSUM62

[0127] Open Gap: 11 and Extension Gap: 1 penalties

[0128] Gap x drop-off: 50

[0129] Expect: 10

[0130] Word Size: 3

[0131] Filter: on

[0132] Percent identity may be measured over the length of an entire defined polypeptide sequence, for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for example, over the length of a fragment taken from a larger, defined polypeptide sequence, for instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least 150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be used to describe a length over which percentage identity may be measured.

[0133] “Human artificial chromosomes” (HACs) are linear microchromosomes which may contain DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for chromosome replication, segregation and maintenance.

[0134] The term “humanized antibody” refers to an antibody molecule in which the amino acid sequence in the non-antigen binding regions has been altered so that the antibody more closely resembles a human antibody, and still retains its original binding ability.

[0135] “Hybridization” refers to the process by which a polynucleotide strand anneals with a complementary strand through base pairing under defined hybridization conditions. Specific hybridization is an indication that two nucleic acid sequences share a high degree of complementarity. Specific hybridization complexes form under permissive annealing conditions and remain hybridized after the “washing” step(s). The washing step(s) is particularly important in determining the stringency of the hybridization process, with more stringent conditions allowing less non-specific binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill in the art and may be consistent among hybridization experiments, whereas wash conditions may be varied among experiments to achieve the desired stringency, and therefore hybridization specificity. Permissive annealing conditions occur, for example, at 68° C. in the presence of about 6×SSC, about 1% (w/v) SDS, and about 100 μg/ml sheared, denatured salmon sperm DNA.

[0136] Generally, stringency of hybridization is expressed, in part, with reference to the temperature under which the wash step is carried out. Such wash temperatures are typically selected to be about 5° C. to 20° C. lower than the thermal melting point (T_(m)) for the specific sequence at a defined ionic strength and pH. The T_(m) is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. An equation for calculating T_(m) and conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2^(nd) ed., vol. 1-3, Cold Spring Harbor Press, Plainview N.Y.; specifically see volume 2, chapter 9.
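The relationship between T_(m) and wash temperature can be sketched in Python as follows. The text does not reproduce the T_(m) equation, so the formula used below is an assumption: a commonly quoted empirical approximation for long DNA duplexes, whose constants vary between sources. The function names are hypothetical, and the cited laboratory manual should be consulted for the equation actually intended.

```python
import math


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def tm_estimate(seq: str, na_molar: float) -> float:
    """Approximate T_m in deg C (assumed empirical formula, not from the text)."""
    return (81.5 + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent(seq) - 675.0 / len(seq))


def wash_window(tm: float) -> tuple[float, float]:
    """Wash temperatures from about 20 to about 5 deg C below T_m."""
    return (tm - 20.0, tm - 5.0)


print(wash_window(80.0))  # (60.0, 75.0)
```

Under this approximation, a higher G+C content or a higher salt concentration raises the estimated T_(m), which is consistent with lower-salt washes being more stringent at a given temperature.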

[0137] High stringency conditions for hybridization between polynucleotides of the present invention include wash conditions of 68° C. in the presence of about 0.2×SSC and about 0.1% SDS, for 1 hour. Alternatively, temperatures of about 65° C., 60° C., 55° C., or 42° C. may be used. SSC concentration may be varied from about 0.1 to 2×SSC, with SDS being present at about 0.1%. Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents include, for instance, sheared and denatured salmon sperm DNA at about 100-200 μg/ml. Organic solvent, such as formamide at a concentration of about 35-50% v/v, may also be used under particular circumstances, such as for RNA:DNA hybridizations. Useful variations on these wash conditions will be readily apparent to those of ordinary skill in the art. Hybridization, particularly under high stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such similarity is strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

[0138] The term “hybridization complex” refers to a complex formed between two nucleic acid sequences by virtue of the formation of hydrogen bonds between complementary bases. A hybridization complex may be formed in solution (e.g., C₀t or R₀t analysis) or formed between one nucleic acid sequence present in solution and another nucleic acid sequence immobilized on a solid support (e.g., paper, membranes, filters, chips, pins or glass slides, or any other appropriate substrate to which cells or their nucleic acids have been fixed).

[0139] The words “insertion” and “addition” refer to changes in an amino acid or nucleotide sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively.

[0140] “Immune response” can refer to conditions associated with inflammation, trauma, immune disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect cellular and systemic defense systems.

[0141] An “immunogenic fragment” is a polypeptide or oligopeptide fragment of TRICH which is capable of eliciting an immune response when introduced into a living organism, for example, a mammal. The term “immunogenic fragment” also includes any polypeptide or oligopeptide fragment of TRICH which is useful in any of the antibody production methods disclosed herein or known in the art.

[0142] The term “microarray” refers to an arrangement of a plurality of polynucleotides, polypeptides, or other chemical compounds on a substrate.

[0143] The terms “element” and “array element” refer to a polynucleotide, polypeptide, or other chemical compound having a unique and defined position on a microarray.

[0144] The term “modulate” refers to a change in the activity of TRICH. For example, modulation may cause an increase or a decrease in protein activity, binding characteristics, or any other biological, functional, or immunological properties of TRICH.

[0145] The phrases “nucleic acid” and “nucleic acid sequence” refer to a nucleotide, oligonucleotide, polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or synthetic origin which may be single-stranded or double-stranded and may represent the sense or the antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material.

[0146] “Operably linked” refers to the situation in which a first nucleic acid sequence is placed in a functional relationship with a second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the promoter affects the transcription or expression of the coding sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where necessary to join two protein coding regions, in the same reading frame.

[0147] “Peptide nucleic acid” (PNA) refers to an antisense molecule or anti-gene agent which comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of amino acid residues ending in lysine. The terminal lysine confers solubility to the composition. PNAs preferentially bind complementary single stranded DNA or RNA and stop transcript elongation, and may be pegylated to extend their lifespan in the cell.

[0148] “Post-translational modification” of a TRICH may involve lipidation, glycosylation, phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in the art. These processes may occur synthetically or biochemically. Biochemical modifications will vary by cell type depending on the enzymatic milieu of TRICH.

[0149] “Probe” refers to nucleic acid sequences encoding TRICH, their complements, or fragments thereof, which are used to detect identical, allelic or related nucleic acid sequences. Probes are isolated oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. “Primers” are short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide by complementary base-pairing. The primer may then be extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR).

[0150] Probes and primers as used in the present invention typically comprise at least 15 contiguous nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers may be considerably longer than these examples, and it is understood that any length supported by the specification, including the tables, figures, and Sequence Listing, may be used.

[0151] Methods for preparing and using probes and primers are described in the references, for example Sambrook, J. et al. (1989) Molecular Cloning: A Laboratory Manual, 2^(nd) ed., vol. 1-3, Cold Spring Harbor Press, Plainview N.Y.; Ausubel, F. M. et al. (1987) Current Protocols in Molecular Biology, Greene Publ. Assoc. & Wiley-Intersciences, New York N.Y.; Innis, M. et al. (1990) PCR Protocols, A Guide to Methods and Applications, Academic Press, San Diego Calif. PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge Mass.).

[0152] Oligonucleotides for use as primers are selected using software known in the art for such purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to 100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to 5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer selection programs have incorporated additional features for expanded capabilities. For example, the PrimOU primer selection program (available to the public from the Genome Center at University of Texas South West Medical Center, Dallas Tex.) is capable of choosing specific primers from megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3 primer selection program (available to the public from the Whitehead Institute/MIT Center for Genome Research, Cambridge Mass.) allows the user to input a “mispriming library,” in which sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the selection of oligonucleotides for microarrays. (The source code for the latter two primer selection programs may also be obtained from their respective sources and modified to meet the user's specific needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments, thereby allowing selection of primers that hybridize to either the most conserved or least conserved regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and polynucleotide fragments identified by any of the above selection methods are useful in hybridization technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of oligonucleotide selection are not limited to those described above.

[0153] A “recombinant nucleic acid” is a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two or more otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques such as those described in Sambrook, supra. The term recombinant includes nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid sequence operably linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used, for example, to transform a cell.

[0154] Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is expressed, inducing a protective immunological response in the mammal.

[0155] A “regulatory element” refers to a nucleic acid sequence usually derived from untranslated regions of a gene and includes enhancers, promoters, introns, and 5′ and 3′ untranslated regions (UTRs). Regulatory elements interact with host or viral proteins which control transcription, translation, or RNA stability.

[0156] “Reporter molecules” are chemical or biochemical moieties used for labeling a nucleic acid, amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent, chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and other moieties known in the art.

[0157] An “RNA equivalent,” in reference to a DNA sequence, is composed of the same linear sequence of nucleotides as the reference DNA sequence with the exception that all occurrences of the nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose instead of deoxyribose.
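At the level of sequence text, the definition in paragraph [0157] amounts to a one-for-one substitution of uracil for thymine. The following is a minimal sketch of that substitution; the function name `rna_equivalent` is an assumption introduced here for illustration, and the sketch represents only the base sequence, not the change in sugar backbone.

```python
def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence string.

    Every thymine (T/t) is replaced with uracil (U/u); all other
    characters are left unchanged. The ribose backbone is implied by
    context and is not represented in the string.
    """
    return dna.replace("T", "U").replace("t", "u")


if __name__ == "__main__":
    # Illustrative input only; not a sequence from the Sequence Listing.
    print(rna_equivalent("ATGGCTTAA"))
```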

[0158] The term “sample” is used in its broadest sense. A sample suspected of containing TRICH, nucleic acids encoding TRICH, or fragments thereof may comprise a bodily fluid; an extract from a cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or cDNA, in solution or bound to a substrate; a tissue; a tissue print; etc.

[0159] The terms “specific binding” and “specifically binding” refer tothat interaction between a protein or peptide and an agonist, anantibody, an antagonist, a small molecule, or any natural or syntheticbinding composition. The interaction is dependent upon the presence of aparticular structure of the protein, e.g., the antigenic determinant orepitope, recognized by the binding molecule. For example, if an antibodyis specific for epitope “A,” the presence of a polypeptide comprisingthe epitope A, or the presence of free unlabeled A, in a reactioncontaining free labeled A and the antibody will reduce the amount oflabeled A that binds to the antibody.

[0160] The term “substantially purified” refers to nucleic acid or amino acid sequences that are removed from their natural environment and are isolated or separated, and are at least 60% free, preferably at least 75% free, and most preferably at least 90% free from other components with which they are naturally associated.

[0161] A “substitution” refers to the replacement of one or more amino acid residues or nucleotides by different amino acid residues or nucleotides, respectively.

[0162] “Substrate” refers to any suitable rigid or semi-rigid support including membranes, filters, chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers, microparticles and capillaries. The substrate can have a variety of surface forms, such as wells, trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

[0163] A “transcript image” or “expression profile” refers to the collective pattern of gene expression by a particular cell type or tissue under given conditions at a given time.

[0164] “Transformation” describes a process by which exogenous DNA isintroduced into a recipient cell. Transformation may occur under naturalor artificial conditions according to various methods well known in theart, and may rely on any known method for the insertion of foreignnucleic acid sequences into a prokaryotic or eukaryotic host cell. Themethod for transformation is selected based on the type of host cellbeing transformed and may include, but is not limited to, bacteriophageor viral infection, electroporation, heat shock, lipofection, andparticle bombardment. The term “transformed cells” includes stablytransformed cells in which the inserted DNA is capable of replicationeither as an autonomously replicating plasmid or as part of the hostchromosome, as well as transiently transformed cells which express theinserted DNA or RNA for limited periods of time.

[0165] A “transgenic organism,” as used herein, is any organism,including but not limited to animals and plants, in which one or more ofthe cells of the organism contains heterologous nucleic acid introducedby way of human intervention, such as by transgenic techniques wellknown in the art. The nucleic acid is introduced into the cell, directlyor indirectly by introduction into a precursor of the cell, by way ofdeliberate genetic manipulation, such as by microinjection or byinfection with a recombinant virus. The term genetic manipulation doesnot include classical cross-breeding, or in vitro fertilization, butrather is directed to the introduction of a recombinant DNA molecule.The transgenic organisms contemplated in accordance with the presentinvention include bacteria, cyanobacteria, fungi, plants and animals.The isolated DNA of the present invention can be introduced into thehost by methods known in the art, for example infection, transfection,transformation or transconjugation. Techniques for transferring the DNAof the present invention into such organisms are widely known andprovided in references such as Sambrook et al. (1989), supra.

[0166] A “variant” of a particular nucleic acid sequence is defined as a nucleic acid sequence having at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of the nucleic acid sequences using blastn with the “BLAST 2 Sequences” tool Version 2.0.9 (May 7, 1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length. A variant may be described as, for example, an “allelic” (as defined above), “splice,” “species,” or “polymorphic” variant. A splice variant may have significant identity to a reference molecule, but will generally have a greater or lesser number of nucleotides due to alternate splicing of exons during mRNA processing. The corresponding polypeptide may possess additional functional domains or lack domains that are present in the reference molecule. Species variants are polynucleotide sequences that vary from one species to another. The resulting polypeptides will generally have significant amino acid identity relative to each other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given species. Polymorphic variants also may encompass “single nucleotide polymorphisms” (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The presence of SNPs may be indicative of, for example, a certain population, a disease state, or a propensity for a disease state.

[0167] A “variant” of a particular polypeptide sequence is defined as a polypeptide sequence having at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of the polypeptide sequences using blastp with the “BLAST 2 Sequences” tool Version 2.0.9 (May 7, 1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence identity over a certain defined length of one of the polypeptides.
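The percent identity thresholds recited in paragraphs [0166] and [0167] can be illustrated with a simple calculation. The following is a minimal sketch only: it assumes the two sequences have already been aligned (for example, by a tool such as BLAST) and are supplied as equal-length strings with "-" marking gaps. It does not perform the alignment and is not the "BLAST 2 Sequences" procedure itself. The names `percent_identity` and `is_variant` are assumptions introduced for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a precomputed pairwise alignment.

    Both inputs must have the same length, with '-' marking gaps.
    Columns in which both sequences have a gap are ignored; a column
    counts as a match only when both residues are identical non-gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / compared


def is_variant(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True when identity meets the threshold (default 40%, per the definition)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Illustrative peptides only; not sequences from the Sequence Listing.
    print(percent_identity("MKTAYIAK", "MKTA-IAR"))
```

Higher thresholds recited in the definition (for example, 90% or 95%) can be tested by passing a different `threshold` value.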

THE INVENTION

[0168] The invention is based on the discovery of new human transportersand ion channels (TRICH), the polynucleotides encoding TRICH, and theuse of these compositions for the diagnosis, treatment, or prevention oftransport, neurological, muscle, immunological and cell proliferativedisorders.

[0169] Table 1 summarizes the nomenclature for the full lengthpolynucleotide and polypeptide sequences of the invention. Eachpolynucleotide and its corresponding polypeptide are correlated to asingle Incyte project identification number (Incyte Project ID). Eachpolypeptide sequence is denoted by both a polypeptide sequenceidentification number (Polypeptide SEQ ID NO:) and an Incyte polypeptidesequence number (Incyte Polypeptide ID) as shown. Each polynucleotidesequence is denoted by both a polynucleotide sequence identificationnumber (Polynucleotide SEQ ID NO:) and an Incyte polynucleotideconsensus sequence number (Incyte Polynucleotide ID) as shown.

[0170] Table 2 shows sequences with homology to the polypeptides of theinvention as identified by BLAST analysis against the GenBank protein(genpept) database and the PROTEOME database. Columns 1 and 2 show thepolypeptide sequence identification number (Polypeptide SEQ ID NO:) andthe corresponding Incyte polypeptide sequence number (Incyte PolypeptideID) for polypeptides of the invention. Column 3 shows the GenBankidentification number (GenBank ID NO:) of the nearest GenBank homologand the PROTEOME database identification numbers (PROTEOME ID NO:) ofthe nearest PROTEOME database homologs. Column 4 shows the probabilityscores for the matches between each polypeptide and its homolog(s).Column 5 shows the annotation of the GenBank and PROTEOME databasehomolog(s) along with relevant citations where applicable, all of whichare expressly incorporated by reference herein.

[0171] Table 3 shows various structural features of the polypeptides ofthe invention. Columns 1 and 2 show the polypeptide sequenceidentification number (SEQ ID NO:) and the corresponding Incytepolypeptide sequence number (Incyte Polypeptide ID) for each polypeptideof the invention. Column 3 shows the number of amino acid residues ineach polypeptide. Column 4 shows potential phosphorylation sites, andcolumn 5 shows potential glycosylation sites, as determined by theMOTIFS program of the GCG sequence analysis software package (GeneticsComputer Group, Madison Wis.). Column 6 shows amino acid residuescomprising signature sequences, domains, and motifs. Column 7 showsanalytical methods for protein structure/function analysis and in somecases, searchable databases to which the analytical methods wereapplied.

[0172] Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these properties establish that the claimed polypeptides are transporters and ion channels. For example, SEQ ID NO:3 is 85% identical, from residue M27 to residue N989, to rabbit anion exchanger 4a (GenBank ID g11611537) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 0.0, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:3 also contains an HCO₃⁻ transporter family domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS and PROFILESCAN analyses provide further corroborative evidence that SEQ ID NO:3 is an anion exchanger.

[0173] In another example, SEQ ID NO:6 is 47% identical, from residue S7 to residue E350, to hamster Na⁺-dependent ileal bile acid transporter (GenBank ID g455033) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 3.7e-88, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:6 also contains a sodium bile acid symporter family domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from additional BLAST analyses using the PRODOM and DOMO databases provide further corroborative evidence that SEQ ID NO:6 is a sodium/bile acid symporter.

[0174] In another example, SEQ ID NO:9 is 68% identical, from residue E6 to residue I349, to mouse Ac39/physophilin, a subunit of the vacuolar ATPase (GenBank ID g1226235) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 3.2e-130, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:9 also contains an ATP synthase (C/AC39) subunit domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from additional BLAST analyses using the PRODOM and DOMO databases provide further corroborative evidence that SEQ ID NO:9 is a vacuolar ATPase subunit.

[0175] In another example, SEQ ID NO:10 is 83% identical, from residue M154 to residue R591, to murine melastatin (GenBank ID g3047272) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 8.6e-200, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:10 also contains a transient receptor domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS analysis provide further corroborative evidence that SEQ ID NO:10 is a calcium ion channel (note that melastatin has homology to members of the “transient receptor” family of “calcium channels”).

[0176] In another example, SEQ ID NO:12 is 51% identical, from residue G761 to residue E1326, to rat multidrug resistance protein MRP5 (GenBank ID g6682827) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 3.5e-236, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:12 also contains two ABC transporter transmembrane regions and two ABC transporter domains as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS, MOTIFS, and PROFILESCAN analyses provide further corroborative evidence that SEQ ID NO:12 is an ABC transporter.

[0177] In another example, SEQ ID NO:18 is 76% identical, from residue M1 to residue D597, to rat renal osmotic stress-induced Na-Cl organic solute cotransporter (GenBank ID g531469) as determined by the Basic Local Alignment Search Tool (BLAST). (See Table 2.) The BLAST probability score is 1.2e-260, which indicates the probability of obtaining the observed polypeptide sequence alignment by chance. SEQ ID NO:18 also contains a sodium:neurotransmitter symporter family domain as determined by searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data from BLIMPS and PROFILESCAN analyses provide further corroborative evidence that SEQ ID NO:18 is a sodium dependent organic solute transporter. SEQ ID NO:1-2, SEQ ID NO:4-5, SEQ ID NO:7-8, SEQ ID NO:11, SEQ ID NO:13-17 and SEQ ID NO:19-20 were analyzed and annotated in a similar manner. The algorithms and parameters for the analysis of SEQ ID NO:1-20 are described in Table 7.

[0178] As shown in Table 4, the full length polynucleotide sequences ofthe present invention were assembled using cDNA sequences or coding(exon) sequences derived from genomic DNA, or any combination of thesetwo types of sequences. Column 1 lists the polynucleotide sequenceidentification number (Polynucleotide SEQ ID NO:), the correspondingIncyte polynucleotide consensus sequence number (Incyte ID) for eachpolynucleotide of the invention, and the length of each polynucleotidesequence in basepairs. Column 2 shows the nucleotide start (5′) and stop(3′) positions of the cDNA and/or genomic sequences used to assemble thefull length polynucleotide sequences of the invention, and of fragmentsof the polynucleotide sequences which are useful, for example, inhybridization or amplification technologies that identify SEQ IDNO:21-40 or that distinguish between SEQ ID NO:21-40 and relatedpolynucleotide sequences.

[0179] The polynucleotide fragments described in Column 2 of Table 4 may refer specifically, for example, to Incyte cDNAs derived from tissue-specific cDNA libraries or from pooled cDNA libraries. Alternatively, the polynucleotide fragments described in column 2 may refer to GenBank cDNAs or ESTs which contributed to the assembly of the full length polynucleotide sequences. In addition, the polynucleotide fragments described in column 2 may identify sequences derived from the ENSEMBL (The Sanger Centre, Cambridge, UK) database (i.e., those sequences including the designation “ENST”). Alternatively, the polynucleotide fragments described in column 2 may be derived from the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences including the designation “NM” or “NT”) or the NCBI RefSeq Protein Sequence Records (i.e., those sequences including the designation “NP”). Alternatively, the polynucleotide fragments described in column 2 may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an “exon stitching” algorithm. For example, a polynucleotide sequence identified as FL_XXXXXX_N₁_N₂_YYYYY_N₃_N₄ represents a “stitched” sequence in which XXXXXX is the identification number of the cluster of sequences to which the algorithm was applied, and YYYYY is the number of the prediction generated by the algorithm, and N₁, N₂, N₃, ..., if present, represent specific exons that may have been manually edited during analysis (See Example V). Alternatively, the polynucleotide fragments in column 2 may refer to assemblages of exons brought together by an “exon-stretching” algorithm. For example, a polynucleotide sequence identified as FLXXXXXX_gAAAAA_gBBBBB_1_N is a “stretched” sequence, with XXXXXX being the Incyte project identification number, gAAAAA being the GenBank identification number of the human genomic sequence to which the “exon-stretching” algorithm was applied, gBBBBB being the GenBank identification number or NCBI RefSeq identification number of the nearest GenBank protein homolog, and N referring to specific exons (See Example V). In instances where a RefSeq sequence was used as a protein homolog for the “exon-stretching” algorithm, a RefSeq identifier (denoted by “NM,” “NP,” or “NT”) may be used in place of the GenBank identifier (i.e., gBBBBB).

[0180] Alternatively, a prefix identifies component sequences that were hand-edited, predicted from genomic DNA sequences, or derived from a combination of sequence analysis methods. The following Table lists examples of component sequence prefixes and corresponding sequence analysis methods associated with the prefixes (see Example IV and Example V).

Prefix            Type of analysis and/or examples of programs
GNN, GFG, ENST    Exon prediction from genomic sequences using, for example, GENSCAN (Stanford University, CA, USA) or FGENES (Computer Genomics Group, The Sanger Centre, Cambridge, UK).
GBI               Hand-edited analysis of genomic sequences.
FL                Stitched or stretched genomic sequences (see Example V).
INCY              Full length transcript and exon prediction from mapping of EST sequences to the genome. Genomic location and EST composition data are combined to predict the exons and resulting transcript.
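The prefix conventions listed in paragraph [0180] lend themselves to a simple lookup. The following is a minimal sketch, assuming that a component sequence identifier begins with one of the listed prefixes. The function name `classify_component` and the example identifiers are hypothetical and are introduced only for illustration; the descriptions are condensed from the Table in paragraph [0180].

```python
from typing import Optional

# Prefix -> analysis type, condensed from the Table in paragraph [0180].
PREFIX_METHODS = {
    "GNN": "Exon prediction from genomic sequences (e.g., GENSCAN or FGENES)",
    "GFG": "Exon prediction from genomic sequences (e.g., GENSCAN or FGENES)",
    "ENST": "Exon prediction from genomic sequences (e.g., GENSCAN or FGENES)",
    "GBI": "Hand-edited analysis of genomic sequences",
    "FL": "Stitched or stretched genomic sequences",
    "INCY": "Full length transcript and exon prediction from EST-to-genome mapping",
}


def classify_component(identifier: str) -> Optional[str]:
    """Return the analysis type implied by an identifier's prefix, or None.

    Longer prefixes are tested first so that a short prefix cannot
    shadow a longer one sharing the same leading characters.
    """
    upper = identifier.upper()
    for prefix in sorted(PREFIX_METHODS, key=len, reverse=True):
        if upper.startswith(prefix):
            return PREFIX_METHODS[prefix]
    return None


if __name__ == "__main__":
    # Hypothetical identifiers, made up for illustration only.
    for name in ("FL123456_g1111111_g2222222_1_3", "GBI.example", "XYZ0001"):
        print(name, "->", classify_component(name))
```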

[0181] In some cases, Incyte cDNA coverage redundant with the sequencecoverage shown in Table 4 was obtained to confirm the final consensuspolynucleotide sequence, but the relevant Incyte cDNA identificationnumbers are not shown.

[0182] Table 5 shows the representative cDNA libraries for those fulllength polynucleotide sequences which were assembled using Incyte cDNAsequences. The representative cDNA library is the Incyte cDNA librarywhich is most frequently represented by the Incyte cDNA sequences whichwere used to assemble and confirm the above polynucleotide sequences.The tissues and vectors which were used to construct the cDNA librariesshown in Table 5 are described in Table 6.

[0183] The invention also encompasses TRICH variants. A preferred TRICHvariant is one which has at least about 80%, or alternatively at leastabout 90%, or even at least about 95% amino acid sequence identity tothe TRICH amino acid sequence, and which contains at least onefunctional or structural characteristic of TRICH.

[0184] The invention also encompasses polynucleotides which encodeTRICH. In a particular embodiment, the invention encompasses apolynucleotide sequence comprising a sequence selected from the groupconsisting of SEQ ID NO:21-40, which encodes TRICH. The polynucleotidesequences of SEQ ID NO:21-40, as presented in the Sequence Listing,embrace the equivalent RNA sequences, wherein occurrences of thenitrogenous base thymine are replaced with uracil, and the sugarbackbone is composed of ribose instead of deoxyribose.

[0185] The invention also encompasses a variant of a polynucleotidesequence encoding TRICH. In particular, such a variant polynucleotidesequence will have at least about 70%, or alternatively at least about85%, or even at least about 95% polynucleotide sequence identity to thepolynucleotide sequence encoding TRICH. A particular aspect of theinvention encompasses a variant of a polynucleotide sequence comprisinga sequence selected from the group consisting of SEQ ID NO:21-40 whichhas at least about 70%, or alternatively at least about 85%, or even atleast about 95% polynucleotide sequence identity to a nucleic acidsequence selected from the group consisting of SEQ ID NO:21-40. Any oneof the polynucleotide variants described above can encode an amino acidsequence which contains at least one functional or structuralcharacteristic of TRICH.

[0186] In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant of a polynucleotide sequence encoding TRICH. A splice variant may have portions which have significant sequence identity to the polynucleotide sequence encoding TRICH, but will generally have a greater or lesser number of nucleotides due to additions or deletions of blocks of sequence arising from alternate splicing of exons during mRNA processing. A splice variant may have less than about 70%, or alternatively less than about 60%, or alternatively less than about 50% polynucleotide sequence identity to the polynucleotide sequence encoding TRICH over its entire length; however, portions of the splice variant will have at least about 70%, or alternatively at least about 85%, or alternatively at least about 95%, or alternatively 100% polynucleotide sequence identity to portions of the polynucleotide sequence encoding TRICH. For example, a polynucleotide comprising a sequence of SEQ ID NO:40 is a splice variant of a polynucleotide comprising a sequence of SEQ ID NO:29. Any one of the splice variants described above can encode an amino acid sequence which contains at least one functional or structural characteristic of TRICH.

[0187] It will be appreciated by those skilled in the art that as a result of the degeneracy of the genetic code, a multitude of polynucleotide sequences encoding TRICH, some bearing minimal similarity to the polynucleotide sequences of any known and naturally occurring gene, may be produced. Thus, the invention contemplates each and every possible variation of polynucleotide sequence that could be made by selecting combinations based on possible codon choices. These combinations are made in accordance with the standard triplet genetic code as applied to the polynucleotide sequence of naturally occurring TRICH, and all such variations are to be considered as being specifically disclosed.
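The size of the “multitude” referred to in paragraph [0187] can be estimated by multiplying, for each residue, the number of codons that specify it in the standard genetic code. The following is a minimal sketch of that arithmetic; the function name `count_coding_sequences` and the example peptide are assumptions introduced for illustration.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (one-letter codes; '*' denotes a stop). The counts total 64.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4, "*": 3,
}


def count_coding_sequences(peptide: str) -> int:
    """Number of distinct nucleotide sequences encoding the given peptide.

    Raises KeyError for any character that is not a standard one-letter
    amino acid code or '*'.
    """
    return prod(CODON_COUNTS[aa] for aa in peptide.upper())


if __name__ == "__main__":
    # Illustrative tripeptide only: Met (1) x Lys (2) x Leu (6).
    print(count_coding_sequences("MKL"))
```

Because the count is a product over all residues, it grows exponentially with polypeptide length, which is why enumeration of every variation is impractical for a full length protein.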

[0188] Although nucleotide sequences which encode TRICH and its variants are generally capable of hybridizing to the nucleotide sequence of the naturally occurring TRICH under appropriately selected conditions of stringency, it may be advantageous to produce nucleotide sequences encoding TRICH or its derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a particular prokaryotic or eukaryotic host in accordance with the frequency with which particular codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence encoding TRICH and its derivatives without altering the encoded amino acid sequences include the production of RNA transcripts having more desirable properties, such as a greater half-life, than transcripts produced from the naturally occurring sequence.
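Selection of codons according to host usage frequency, as described in paragraph [0188], may be sketched as choosing the most frequently used codon for each residue. In the following minimal sketch, the function name `back_translate` is hypothetical, and the frequencies in `TOY_USAGE` are placeholder values invented for illustration; they are not measured codon usage data for any organism. In practice a complete usage table for the intended host would be supplied.

```python
from typing import Dict

# HYPOTHETICAL placeholder frequencies, for illustration only.
# The codons listed do encode the indicated amino acids, but the
# numbers are not measured values for any host.
TOY_USAGE: Dict[str, Dict[str, float]] = {
    "M": {"ATG": 1.00},
    "K": {"AAA": 0.70, "AAG": 0.30},
    "L": {"CTG": 0.50, "CTC": 0.35, "TTA": 0.15},
}


def back_translate(peptide: str, usage: Dict[str, Dict[str, float]]) -> str:
    """Encode a peptide using the highest-frequency codon for each residue.

    `usage` maps a one-letter amino acid code to a dict of
    codon -> relative frequency in the intended host.
    Raises KeyError if a residue is missing from the table.
    """
    codons = []
    for aa in peptide.upper():
        options = usage[aa]
        codons.append(max(options, key=options.get))
    return "".join(codons)


if __name__ == "__main__":
    print(back_translate("MKL", TOY_USAGE))
```

Always choosing the single most frequent codon is only one possible strategy; other schemes, such as sampling codons in proportion to their frequency, fit the same description equally well.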

[0189] The invention also encompasses production of DNA sequences whichencode TRICH and TRICH derivatives, or fragments thereof, entirely bysynthetic chemistry. After production, the synthetic sequence may beinserted into any of the many available expression vectors and cellsystems using reagents well known in the art. Moreover, syntheticchemistry may be used to introduce mutations into a sequence encodingTRICH or any fragment thereof.

[0190] Also encompassed by the invention are polynucleotide sequencesthat are capable of hybridizing to the claimed polynucleotide sequences,and, in particular, to those shown in SEQ ID NO:21-40 and fragmentsthereof under various conditions of stringency. (See, e.g., Wahl, G. M.and S. L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A. R.(1987) Methods Enzymol. 152:507-511.) Hybridization conditions,including annealing and wash conditions, are described in “Definitions.”

[0191] Methods for DNA sequencing are well known in the art and may beused to practice any of the embodiments of the invention. The methodsmay employ such enzymes as the Klenow fragment of DNA polymerase I,SEQUENASE (US Biochemical, Cleveland Ohio), Taq polymerase (AppliedBiosystems), thermostable T7 polymerase (Amersham Pharmacia Biotech,Piscataway N.J.), or combinations of polymerases and proofreadingexonucleases such as those found in the ELONGASE amplification system(Life Technologies, Gaithersburg Md.). Preferably, sequence preparationis automated with machines such as the MICROLAB 2200 liquid transfersystem (Hamilton, Reno Nev.), PTC200 thermal cycler (MJ Research,Watertown Mass.) and ABI CATALYST 800 thermal cycler (AppliedBiosystems). Sequencing is then carried out using either the ABI 373 or377 DNA sequencing system (Applied Biosystems), the MEGABACE 1000 DNAsequencing system (Molecular Dynamics, Sunnyvale Calif.), or othersystems known in the art. The resulting sequences are analyzed using avariety of algorithms which are well known in the art. (See, e.g.,Ausubel, F. M. (1997) Short Protocols in Molecular Biology, John Wiley &Sons, New York N.Y., unit 7.7; Meyers, R. A. (1995) Molecular Biologyand Biotechnology, Wiley VCH, New York N.Y., pp. 856-853.)

[0192] The nucleic acid sequences encoding TRICH may be extended utilizing a partial nucleotide sequence and employing various PCR-based methods known in the art to detect upstream sequences, such as promoters and regulatory elements. For example, one method which may be employed, restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic DNA within a cloning vector. (See, e.g., Sarkar, G. (1993) PCR Methods Applic. 2:318-322.) Another method, inverse PCR, uses primers that extend in divergent directions to amplify unknown sequence from a circularized template. The template is derived from restriction fragments comprising a known genomic locus and surrounding sequences. (See, e.g., Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186.) A third method, capture PCR, involves PCR amplification of DNA fragments adjacent to known sequences in human and yeast artificial chromosome DNA. (See, e.g., Lagerstrom, M. et al. (1991) PCR Methods Applic. 1:111-119.) In this method, multiple restriction enzyme digestions and ligations may be used to insert an engineered double-stranded sequence into a region of unknown sequence before performing PCR. Other methods which may be used to retrieve unknown sequences are known in the art. (See, e.g., Parker, J. D. et al. (1991) Nucleic Acids Res. 19:3055-3060.) Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo Alto Calif.) to walk genomic DNA. This procedure avoids the need to screen libraries and is useful in finding intron/exon junctions. For all PCR-based methods, primers may be designed using commercially available software, such as OLIGO 4.06 primer analysis software (National Biosciences, Plymouth Minn.) or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the template at temperatures of about 68° C. to 72° C.
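The length and GC content criteria stated at the end of paragraph [0192] can be checked programmatically. The following is a minimal sketch, not a substitute for primer design software such as OLIGO 4.06: it tests only length (22 to 30 nucleotides) and GC content (50% or more), and it deliberately does not estimate annealing temperature, which depends on the melting temperature model and reaction conditions. The function names and example primers are assumptions introduced for illustration.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C bases in a nucleotide sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    upper = seq.upper()
    return 100.0 * (upper.count("G") + upper.count("C")) / len(upper)


def meets_primer_criteria(
    primer: str,
    min_len: int = 22,
    max_len: int = 30,
    min_gc: float = 50.0,
) -> bool:
    """True if the primer satisfies the length and GC content criteria.

    Defaults follow the approximate ranges given in the text
    (about 22 to 30 nucleotides, GC content of about 50% or more).
    Annealing temperature is not evaluated here.
    """
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


if __name__ == "__main__":
    # Artificial sequences, made up for illustration only.
    for candidate in ("GC" * 12, "AT" * 12, "GC" * 5):
        print(candidate, meets_primer_criteria(candidate))
```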

[0193] When screening for full length cDNAs, it is preferable to use libraries that have been size-selected to include larger cDNAs. In addition, random-primed libraries, which often include sequences containing the 5′ regions of genes, are preferable for situations in which an oligo d(T) library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence into 5′ non-transcribed regulatory regions.

[0194] Capillary electrophoresis systems which are commerciallyavailable may be used to analyze the size or confirm the nucleotidesequence of sequencing or PCR products. In particular, capillarysequencing may employ flowable polymers for electrophoretic separation,four different nucleotide-specific, laser-stimulated fluorescent dyes,and a charge coupled device camera for detection of the emittedwavelengths. Output/light intensity may be converted to electricalsignal using appropriate software (e.g., GENOTYPER and SEQUENCENAVIGATOR, Applied Biosystems), and the entire process from loading ofsamples to computer analysis and electronic data display may be computercontrolled. Capillary electrophoresis is especially preferable forsequencing small DNA fragments which may be present in limited amountsin a particular sample.

[0195] In another embodiment of the invention, polynucleotide sequencesor fragments thereof which encode TRICH may be cloned in recombinant DNAmolecules that direct expression of TRICH, or fragments or functionalequivalents thereof, in appropriate host cells. Due to the inherentdegeneracy of the genetic code, other DNA sequences which encodesubstantially the same or a functionally equivalent amino acid sequencemay be produced and used to express TRICH.

[0196] The nucleotide sequences of the present invention can beengineered using methods generally known in the art in order to alterTRICH-encoding sequences for a variety of purposes including, but notlimited to, modification of the cloning, processing, and/or expressionof the gene product. DNA shuffling by random fragmentation and PCRreassembly of gene fragments and synthetic oligonucleotides may be usedto engineer the nucleotide sequences. For example,oligonucleotide-mediated site-directed mutagenesis may be used tointroduce mutations that create new restriction sites, alterglycosylation patterns, change codon preference, produce splicevariants, and so forth.

[0197] The nucleotides of the present invention may be subjected to DNA shuffling techniques such as MOLECULARBREEDING (Maxygen Inc., Santa Clara Calif.; described in U.S. Pat. No. 5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, F. C. et al. (1999) Nat. Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or improve the biological properties of TRICH, such as its biological or enzymatic activity or its ability to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene variants is produced using PCR-mediated recombination of gene fragments. The library is then subjected to selection or screening procedures that identify those gene variants with the desired properties. These preferred variants may then be pooled and further subjected to recursive rounds of DNA shuffling and selection/screening. Thus, genetic diversity is created through “artificial” breeding and rapid molecular evolution. For example, fragments of a single gene containing random point mutations may be recombined, screened, and then reshuffled until the desired properties are optimized. Alternatively, fragments of a given gene may be recombined with fragments of homologous genes in the same gene family, either from the same or different species, thereby maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable manner.
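
The recursive cycle of recombination and selection described above can be sketched in silico. This is a toy analogy only, not the laboratory protocol: crossover between strings stands in for PCR-mediated recombination, and the `fitness` function (identity to a hypothetical target string) stands in for whatever screen would be applied. All names, sequences, and parameters are assumptions made for illustration.

```python
import random


def fitness(variant, target):
    """Stand-in for a screen: number of positions matching the target."""
    return sum(a == b for a, b in zip(variant, target))


def recombine(parent_a, parent_b, rng, n_crossovers=2):
    """Build a child by alternating between two parents at random crossover points."""
    points = sorted(rng.sample(range(1, len(parent_a)), n_crossovers))
    parents = (parent_a, parent_b)
    pieces, source, prev = [], 0, 0
    for point in points + [len(parent_a)]:
        pieces.append(parents[source][prev:point])
        source = 1 - source  # switch template after each crossover
        prev = point
    return "".join(pieces)


def shuffle_rounds(library, target, rounds=5, keep=4, seed=0):
    """Repeat select-then-recombine; selected parents are carried into the next pool."""
    rng = random.Random(seed)
    best_per_round = []
    for _ in range(rounds):
        ranked = sorted(library, key=lambda v: fitness(v, target), reverse=True)
        selected = ranked[:keep]
        children = [recombine(*rng.sample(selected, 2), rng)
                    for _ in range(len(library) - keep)]
        library = selected + children
        best_per_round.append(max(fitness(v, target) for v in library))
    return library, best_per_round


# Hypothetical target and starting library (eight 12-base variants).
target = "ACGTACGTACGT"
start = ["ACGTACAAAAAA", "TTTTTTGTACGT", "ACGAACGAACGA", "TCGTTCGTTCGT",
         "AAAAACGTAAAA", "ACGTCCCCCCCC", "GGGGGGGGACGT", "ACTTACTTACTT"]
final_library, history = shuffle_rounds(start, target)
```

Because the selected parents are retained in each new pool, the best score in `history` cannot decrease from one round to the next; whether it reaches the maximum depends on the starting library and the random crossovers.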

[0198] In another embodiment, sequences encoding TRICH may be synthesized, in whole or in part, using chemical methods well known in the art. (See, e.g., Caruthers, M. H. et al. (1980) Nucleic Acids Symp. Ser. 7:215-223; and Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232.) Alternatively, TRICH itself or a fragment thereof may be synthesized using chemical methods. For example, peptide synthesis can be performed using various solution-phase or solid-phase techniques. (See, e.g., Creighton, T. (1984) Proteins, Structures and Molecular Properties, W H Freeman, New York N.Y., pp. 55-60; and Roberge, J. Y. et al. (1995) Science 269:202-204.) Automated synthesis may be achieved using the ABI 431A peptide synthesizer (Applied Biosystems). Additionally, the amino acid sequence of TRICH, or any part thereof, may be altered during direct synthesis and/or combined with sequences from other proteins, or any part thereof, to produce a variant polypeptide or a polypeptide having a sequence of a naturally occurring polypeptide.

[0199] The peptide may be substantially purified by preparative high performance liquid chromatography. (See, e.g., Chiez, R. M. and F. Z. Regnier (1990) Methods Enzymol. 182:392-421.) The composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing. (See, e.g., Creighton, supra, pp. 28-53.)

[0200] In order to express a biologically active TRICH, the nucleotide sequences encoding TRICH or derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains the necessary elements for transcriptional and translational control of the inserted coding sequence in a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and inducible promoters, and 5′ and 3′ untranslated regions in the vector and in polynucleotide sequences encoding TRICH. Such elements may vary in their strength and specificity. Specific initiation signals may also be used to achieve more efficient translation of sequences encoding TRICH. Such signals include the ATG initiation codon and adjacent sequences, e.g., the Kozak sequence. In cases where sequences encoding TRICH and its initiation codon and upstream regulatory sequences are inserted into the appropriate expression vector, no additional transcriptional or translational control signals may be needed. However, in cases where only coding sequence, or a fragment thereof, is inserted, exogenous translational control signals including an in-frame ATG initiation codon should be provided by the vector. Exogenous translational elements and initiation codons may be of various origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of enhancers appropriate for the particular host cell system used. (See, e.g., Scharf, D. et al. (1994) Results Probl. Cell Differ. 20:125-162.)
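
As a rough illustration of the initiation context discussed above, the sketch below inspects the two positions generally regarded as most influential in the vertebrate Kozak consensus (gccRccATGG): a purine at position -3 and a G at position +4, counting the A of ATG as +1. The labels "strong", "adequate", and "weak" and the function name are conventions assumed for this example, and the test sequences are hypothetical.

```python
def kozak_context(seq, atg_index):
    """Classify the context of the ATG that starts at atg_index in seq."""
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        raise ValueError("flanking sequence too short to evaluate")
    purine_at_minus3 = seq[atg_index - 3] in "AG"  # position -3
    g_at_plus4 = seq[atg_index + 3] == "G"         # position +4
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


# Hypothetical inserts; the first ATG is taken as the candidate start codon.
for insert in ("GCCACCATGGCT", "TTTACCATGTCC", "TTTTTTATGTCC"):
    start = insert.find("ATG")
    print(insert, kozak_context(insert, start))
```

A check of this kind only flags the sequence context; it says nothing about reading frame, upstream regulatory elements, or actual expression levels in a given host.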

[0201] Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding TRICH and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview N.Y., ch. 4, 8, and 16-17; Ausubel, F. M. et al. (1995) Current Protocols in Molecular Biology, John Wiley & Sons, New York N.Y., ch. 9, 13, and 16.)

[0202] A variety of expression vector/host systems may be utilized to contain and express sequences encoding TRICH. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with viral expression vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g., cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g., Ti or pBR322 plasmids); or animal cell systems. (See, e.g., Sambrook, supra; Ausubel, supra; Van Heeke, G. and S. M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Engelhard, E. K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945; Takamatsu, N. (1987) EMBO J. 6:307-311; The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York N.Y., pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659; and Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355.) Expression vectors derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial plasmids, may be used for delivery of nucleotide sequences to the targeted organ, tissue, or cell population. (See, e.g., Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5(6):350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci. USA 90(13):6340-6344; Buller, R. M. et al. (1985) Nature 317(6040):813-815; McGregor, D. P. et al. (1994) Mol. Immunol. 31(3):219-226; and Verma, I. M. and N. Somia (1997) Nature 389:239-242.) The invention is not limited by the host cell employed.

[0203] In bacterial systems, a number of cloning and expression vectors may be selected depending upon the use intended for polynucleotide sequences encoding TRICH. For example, routine cloning, subcloning, and propagation of polynucleotide sequences encoding TRICH can be achieved using a multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla Calif.) or PSPORT1 plasmid (Life Technologies). Ligation of sequences encoding TRICH into the vector's multiple cloning site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of nested deletions in the cloned sequence. (See, e.g., Van Heeke, G. and S. M. Schuster (1989) J. Biol. Chem. 264:5503-5509.) When large quantities of TRICH are needed, e.g., for the production of antibodies, vectors which direct high level expression of TRICH may be used. For example, vectors containing the strong, inducible SP6 or T7 bacteriophage promoter may be used.

[0204] Yeast expression systems may be used for production of TRICH. A number of vectors containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such vectors direct either the secretion or intracellular retention of expressed proteins and enable integration of foreign sequences into the host genome for stable propagation. (See, e.g., Ausubel, 1995, supra; Bitter, G. A. et al. (1987) Methods Enzymol. 153:516-544; and Scorer, C. A. et al. (1994) Bio/Technology 12:181-184.)

[0205] Plant systems may also be used for expression of TRICH. Transcription of sequences encoding TRICH may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J. 6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock promoters may be used. (See, e.g., Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984) Science 224:838-843; and Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105.) These constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated transfection. (See, e.g., The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York N.Y., pp. 191-196.)

[0206] In mammalian cells, a number of viral-based expression systems may be utilized. In cases where an adenovirus is used as an expression vector, sequences encoding TRICH may be ligated into an adenovirus transcription/translation complex consisting of the late promoter and tripartite leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to obtain infective virus which expresses TRICH in host cells. (See, e.g., Logan, J. and T. Shenk (1984) Proc. Natl. Acad. Sci. USA 81:3655-3659.) In addition, transcription enhancers, such as the Rous sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based vectors may also be used for high-level protein expression.

[0207] Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are constructed and delivered via conventional delivery methods (liposomes, polycationic amino polymers, or vesicles) for therapeutic purposes. (See, e.g., Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355.)

[0208] For long term production of recombinant proteins in mammalian systems, stable expression of TRICH in cell lines is preferred. For example, sequences encoding TRICH can be transformed into cell lines using expression vectors which may contain viral origins of replication and/or endogenous expression elements and a selectable marker gene on the same or on a separate vector. Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days in enriched media before being switched to selective media. The purpose of the selectable marker is to confer resistance to a selective agent, and its presence allows growth and recovery of cells which successfully express the introduced sequences. Resistant clones of stably transformed cells may be propagated using tissue culture techniques appropriate to the cell type.

[0209] Any number of selection systems may be used to recover transformed cell lines. These include, but are not limited to, the herpes simplex virus thymidine kinase and adenine phosphoribosyltransferase genes, for use in tk⁻ and aprt⁻ cells, respectively. (See, e.g., Wigler, M. et al. (1977) Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823.) Also, antimetabolite, antibiotic, or herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; and als and pat confer resistance to chlorsulfuron and phosphinothricin, respectively. (See, e.g., Wigler, M. et al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14.) Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular requirements for metabolites. (See, e.g., Hartman, S. C. and R. C. Mulligan (1988) Proc. Natl. Acad. Sci. USA 85:8047-8051.) Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; Clontech), β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used. These markers can be used not only to identify transformants, but also to quantify the amount of transient or stable protein expression attributable to a specific vector system. (See, e.g., Rhodes, C. A. (1995) Methods Mol. Biol. 55:121-131.)

[0210] Although the presence/absence of marker gene expression suggests that the gene of interest is also present, the presence and expression of the gene may need to be confirmed. For example, if the sequence encoding TRICH is inserted within a marker gene sequence, transformed cells containing sequences encoding TRICH can be identified by the absence of marker gene function. Alternatively, a marker gene can be placed in tandem with a sequence encoding TRICH under the control of a single promoter. Expression of the marker gene in response to induction or selection usually indicates expression of the tandem gene as well.

[0211] In general, host cells that contain the nucleic acid sequence encoding TRICH and that express TRICH may be identified by a variety of procedures known to those of skill in the art. These procedures include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein bioassay or immunoassay techniques which include membrane, solution, or chip based technologies for the detection and/or quantification of nucleic acid or protein sequences.

[0212] Immunological methods for detecting and measuring the expression of TRICH using either specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes on TRICH is preferred, but a competitive binding assay may be employed. These and other assays are well known in the art. (See, e.g., Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul Minn., Sect. IV; Coligan, J. E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and Wiley-Interscience, New York N.Y.; and Pound, J. D. (1998) Immunochemical Protocols, Humana Press, Totowa N.J.)

[0213] A wide variety of labels and conjugation techniques are known by those skilled in the art and may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization or PCR probes for detecting sequences related to polynucleotides encoding TRICH include oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide. Alternatively, the sequences encoding TRICH, or any fragments thereof, may be cloned into a vector for the production of an mRNA probe. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety of commercially available kits, such as those provided by Amersham Pharmacia Biotech, Promega (Madison Wis.), and US Biochemical. Suitable reporter molecules or labels which may be used for ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

[0214] Host cells transformed with nucleotide sequences encoding TRICH may be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The protein produced by a transformed cell may be secreted or retained intracellularly depending on the sequence and/or the vector used. As will be understood by those of skill in the art, expression vectors containing polynucleotides which encode TRICH may be designed to contain signal sequences which direct secretion of TRICH through a prokaryotic or eukaryotic cell membrane.

[0215] In addition, a host cell strain may be chosen for its ability to modulate expression of the inserted sequences or to process the expressed protein in the desired fashion. Such modifications of the polypeptide include, but are not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves a “prepro” or “pro” form of the protein may also be used to specify protein targeting, folding, and/or activity. Different host cells which have specific cellular machinery and characteristic mechanisms for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the American Type Culture Collection (ATCC, Manassas Va.) and may be chosen to ensure the correct modification and processing of the foreign protein.

[0216] In another embodiment of the invention, natural, modified, or recombinant nucleic acid sequences encoding TRICH may be ligated to a heterologous sequence resulting in translation of a fusion protein in any of the aforementioned host systems. For example, a chimeric TRICH protein containing a heterologous moiety that can be recognized by a commercially available antibody may facilitate the screening of peptide libraries for inhibitors of TRICH activity. Heterologous protein and peptide moieties may also facilitate purification of fusion proteins using commercially available affinity matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc, and hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate resins, respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of fusion proteins using commercially available monoclonal and polyclonal antibodies that specifically recognize these epitope tags. A fusion protein may also be engineered to contain a proteolytic cleavage site located between the TRICH encoding sequence and the heterologous protein sequence, so that TRICH may be cleaved away from the heterologous moiety following purification. Methods for fusion protein expression and purification are discussed in Ausubel (1995, supra, ch. 10). A variety of commercially available kits may also be used to facilitate expression and purification of fusion proteins.
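
The "respectively" pairing of fusion moieties and purification matrices above can be written out as a small lookup, which may be easier to scan than the prose. The pairings are taken directly from the paragraph; the function name and the wording of the returned strings are illustrative assumptions.

```python
# Tag-to-matrix pairs as listed in the paragraph above.
AFFINITY_TAGS = ["GST", "MBP", "Trx", "CBP", "6-His"]
MATRICES = ["glutathione", "maltose", "phenylarsine oxide",
            "calmodulin", "metal-chelate resin"]
AFFINITY_MATRIX = dict(zip(AFFINITY_TAGS, MATRICES))

# Epitope tags purified with tag-specific antibodies instead of a ligand matrix.
EPITOPE_TAGS = {"FLAG", "c-myc", "HA"}


def purification_route(tag):
    """Return a one-line description of how a fusion carrying this tag is purified."""
    if tag in AFFINITY_MATRIX:
        return f"affinity purification on immobilized {AFFINITY_MATRIX[tag]}"
    if tag in EPITOPE_TAGS:
        return f"immunoaffinity purification with an anti-{tag} antibody"
    raise KeyError(f"tag not listed in the paragraph: {tag}")


for tag in AFFINITY_TAGS + sorted(EPITOPE_TAGS):
    print(tag, "->", purification_route(tag))
```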

[0217] In a further embodiment of the invention, synthesis of radiolabeled TRICH may be achieved in vitro using the TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems couple transcription and translation of protein-coding sequences operably associated with the T7, T3, or SP6 promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for example, ³⁵S-methionine.

[0218] TRICH of the present invention or fragments thereof may be used to screen for compounds that specifically bind to TRICH. At least one and up to a plurality of test compounds may be screened for specific binding to TRICH. Examples of test compounds include antibodies, oligonucleotides, proteins (e.g., receptors), or small molecules.

[0219] In one embodiment, the compound thus identified is closely related to the natural ligand of TRICH, e.g., a ligand or fragment thereof, a natural substrate, a structural or functional mimetic, or a natural binding partner. (See, e.g., Coligan, J. E. et al. (1991) Current Protocols in Immunology 1(2): Chapter 5.) Similarly, the compound can be closely related to the natural receptor to which TRICH binds, or to at least a fragment of the receptor, e.g., the ligand binding site. In either case, the compound can be rationally designed using known techniques. In one embodiment, screening for these compounds involves producing appropriate cells which express TRICH, either as a secreted protein or on the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells expressing TRICH or cell membrane fractions which contain TRICH are then contacted with a test compound and binding, stimulation, or inhibition of activity of either TRICH or the compound is analyzed.

[0220] An assay may simply test binding of a test compound to the polypeptide, wherein binding is detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example, the assay may comprise the steps of combining at least one test compound with TRICH, either in solution or affixed to a solid support, and detecting the binding of TRICH to the compound. Alternatively, the assay may detect or measure binding of a test compound in the presence of a labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to a solid support.

[0221] TRICH of the present invention or fragments thereof may be used to screen for compounds that modulate the activity of TRICH. Such compounds may include agonists, antagonists, or partial or inverse agonists. In one embodiment, an assay is performed under conditions permissive for TRICH activity, wherein TRICH is combined with at least one test compound, and the activity of TRICH in the presence of a test compound is compared with the activity of TRICH in the absence of the test compound. A change in the activity of TRICH in the presence of the test compound is indicative of a compound that modulates the activity of TRICH. Alternatively, a test compound is combined with an in vitro or cell-free system comprising TRICH under conditions suitable for TRICH activity, and the assay is performed. In either of these assays, a test compound which modulates the activity of TRICH may do so indirectly and need not come in direct contact with TRICH. At least one and up to a plurality of test compounds may be screened.
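
The comparison described above (activity with the test compound versus activity without it) can be sketched as a simple relative-change calculation. The 20% threshold, the function name, and the example readings are assumptions made for illustration; the text does not specify how large a change counts as modulation, and a real screen would account for replicates and assay noise.

```python
def classify_modulation(activity_with, activity_without, threshold=0.2):
    """Compare activity with and without a test compound.

    Returns (relative_change, call); threshold is an assumed cutoff.
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    change = (activity_with - activity_without) / activity_without
    if change > threshold:
        call = "increases activity"
    elif change < -threshold:
        call = "decreases activity"
    else:
        call = "no clear effect"
    return change, call


# Hypothetical readings in arbitrary activity units: (with compound, without).
readings = {"compound A": (150.0, 100.0),
            "compound B": (40.0, 100.0),
            "compound C": (105.0, 100.0)}
calls = {name: classify_modulation(w, wo)[1] for name, (w, wo) in readings.items()}
```

A compound flagged this way has only been shown to change the measured activity; as the paragraph notes, the effect may be indirect.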

[0222] In another embodiment, polynucleotides encoding TRICH or their mammalian homologs may be “knocked out” in an animal model system using homologous recombination in embryonic stem (ES) cells. Such techniques are well known in the art and are useful for the generation of animal models of human disease. (See, e.g., U.S. Pat. No. 5,175,383 and U.S. Pat. No. 5,767,337.) For example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse embryo and grown in culture. The ES cells are transformed with a vector containing the gene of interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M. R. (1989) Science 244:1288-1292). The vector integrates into the corresponding region of the host genome by homologous recombination. Alternatively, homologous recombination takes place using the Cre-loxP system to knock out a gene of interest in a tissue- or developmental stage-specific manner (Marth, J. D. (1996) J. Clin. Invest. 97:1999-2002; Wagner, K. U. et al. (1997) Nucleic Acids Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential therapeutic or toxic agents.

[0223] Polynucleotides encoding TRICH may also be manipulated in vitro in ES cells derived from human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J. A. et al. (1998) Science 282:1145-1147).

[0224] Polynucleotides encoding TRICH can also be used to create “knockin” humanized animals (pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a region of a polynucleotide encoding TRICH is injected into animal ES cells, and the injected sequence integrates into the animal cell genome. Transformed cells are injected into blastulae, and the blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and treated with potential pharmaceutical agents to obtain information on treatment of a human disease. Alternatively, a mammal inbred to overexpress TRICH, e.g., by secreting TRICH in its milk, may also serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

THERAPEUTICS

[0225] Chemical and structural similarity, e.g., in the context of sequences and motifs, exists between regions of TRICH and transporters and ion channels. In addition, examples of tissues expressing TRICH include primary human breast epithelial cells; further examples can be found in Table 6. Therefore, TRICH appears to play a role in transport, neurological, muscle, immunological and cell proliferative disorders. In the treatment of disorders associated with increased TRICH expression or activity, it is desirable to decrease the expression or activity of TRICH. In the treatment of disorders associated with decreased TRICH expression or activity, it is desirable to increase the expression or activity of TRICH.

[0226] Therefore, in one embodiment, TRICH or a fragment or derivative thereof may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of TRICH. Examples of such disorders include, but are not limited to, a transport disorder such as akinesia, amyotrophic lateral sclerosis, ataxia telangiectasia, cystic fibrosis, Becker's muscular dystrophy, Bell's palsy, Charcot-Marie Tooth disease, diabetes mellitus, diabetes insipidus, diabetic neuropathy, Duchenne muscular dystrophy, hyperkalemic periodic paralysis, normokalemic periodic paralysis, Parkinson's disease, malignant hyperthermia, multidrug resistance, myasthenia gravis, myotonic dystrophy, catatonia, tardive dyskinesia, dystonias, peripheral neuropathy, cerebral neoplasms, prostate cancer, cardiac disorders associated with transport, e.g., angina, bradyarrhythmia, tachyarrhythmia, hypertension, Long QT syndrome, myocarditis, cardiomyopathy, nemaline myopathy, centronuclear myopathy, lipid myopathy, mitochondrial myopathy, thyrotoxic myopathy, ethanol myopathy, dermatomyositis, inclusion body myositis, infectious myositis, polymyositis, neurological disorders associated with transport, e.g., Alzheimer's disease, amnesia, bipolar disorder, dementia, depression, epilepsy, Tourette's disorder, paranoid psychoses, and schizophrenia, and other disorders associated with transport, e.g., neurofibromatosis, postherpetic neuralgia, trigeminal neuropathy, sarcoidosis, sickle cell anemia, Wilson's disease, cataracts, infertility, pulmonary artery stenosis, sensorineural autosomal deafness, hyperglycemia, hypoglycemia, Graves' disease, goiter, Cushing's disease, Addison's disease, glucose-galactose malabsorption syndrome, glycogen storage disease, hypercholesterolemia, adrenoleukodystrophy, Zellweger syndrome, Menkes disease, occipital horn syndrome, von Gierke disease, pseudohypoaldosteronism type 1, Liddle's syndrome, cystinuria, iminoglycinuria, Hartnup disease, Fanconi disease, and Bartter syndrome; a neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, hemiplegic migraine, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia; a muscle disorder such as cardiomyopathy, myocarditis, Duchenne's muscular dystrophy, Becker's muscular dystrophy, myotonic dystrophy, central core disease, nemaline myopathy, centronuclear myopathy, lipid myopathy, mitochondrial myopathy, infectious myositis, polymyositis, dermatomyositis, inclusion body myositis, thyrotoxic myopathy, ethanol myopathy, angina, anaphylactic shock, arrhythmias, asthma, cardiovascular shock, Cushing's syndrome, hypertension, hypoglycemia, myocardial infarction, migraine, pheochromocytoma, and myopathies including encephalopathy, epilepsy, Kearns-Sayre syndrome, lactic acidosis, myoclonic disorder, ophthalmoplegia, acid maltase deficiency (AMD, also known as Pompe's disease), generalized myotonia, and myotonia congenita; an immunological disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; and a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus.

[0227] In another embodiment, a vector capable of expressing TRICH or a fragment or derivative thereof may be administered to a subject to treat or prevent a disorder associated with decreased expression or activity of TRICH including, but not limited to, those described above.

[0228] In a further embodiment, a composition comprising a substantiallypurified TRICH in conjunction with a suitable pharmaceutical carrier maybe administered to a subject to treat or prevent a disorder associatedwith decreased expression or activity of TRICH including, but notlimited to, those provided above.

[0229] In still another embodiment, an agonist which modulates theactivity of TRICH may be administered to a subject to treat or prevent adisorder associated with decreased expression or activity of TRICHincluding, but not limited to, those listed above.

[0230] In a further embodiment, an antagonist of TRICH may beadministered to a subject to treat or prevent a disorder associated withincreased expression or activity of TRICH. Examples of such disordersinclude, but are not limited to, those transport, neurological, muscle,immunological and cell proliferative disorders described above. In oneaspect, an antibody which specifically binds TRICH may be used directlyas an antagonist or indirectly as a targeting or delivery mechanism forbringing a pharmaceutical agent to cells or tissues which express TRICH.

[0231] In an additional embodiment, a vector expressing the complementof the polynucleotide encoding TRICH may be administered to a subject totreat or prevent a disorder associated with increased expression oractivity of TRICH including, but not limited to, those described above.

[0232] In other embodiments, any of the proteins, antagonists,antibodies, agonists, complementary sequences, or vectors of theinvention may be administered in combination with other appropriatetherapeutic agents. Selection of the appropriate agents for use incombination therapy may be made by one of ordinary skill in the art,according to conventional pharmaceutical principles. The combination oftherapeutic agents may act synergistically to effect the treatment orprevention of the various disorders described above. Using thisapproach, one may be able to achieve therapeutic efficacy with lowerdosages of each agent, thus reducing the potential for adverse sideeffects.

[0233] An antagonist of TRICH may be produced using methods which aregenerally known in the art. In particular, purified TRICH may be used toproduce antibodies or to screen libraries of pharmaceutical agents toidentify those which specifically bind TRICH. Antibodies to TRICH mayalso be generated using methods that are well known in the art. Suchantibodies may include, but are not limited to, polyclonal, monoclonal,chimeric, and single chain antibodies, Fab fragments, and fragmentsproduced by a Fab expression library. Neutralizing antibodies (i.e.,those which inhibit dimer formation) are generally preferred fortherapeutic use. Single chain antibodies (e.g., from camels or llamas)may be potent enzyme inhibitors and may have advantages in the design ofpeptide mimetics, and in the development of immuno-adsorbents andbiosensors (Muyldermans, S. (2001) J. Biotechnol. 74:277-302).

[0234] For the production of antibodies, various hosts including goats,rabbits, rats, mice, camels, dromedaries, llamas, humans, and others maybe immunized by injection with TRICH or with any fragment oroligopeptide thereof which has immunogenic properties. Depending on thehost species, various adjuvants may be used to increase immunologicalresponse. Such adjuvants include, but are not limited to, Freund's,mineral gels such as aluminum hydroxide, and surface active substancessuch as lysolecithin, pluronic polyols, polyanions, peptides, oilemulsions, KLH, and dinitrophenol. Among adjuvants used in humans, BCG(bacilli Calmette-Guerin) and Corynebacterium parvum are especiallypreferable.

[0235] It is preferred that the oligopeptides, peptides, or fragmentsused to induce antibodies to TRICH have an amino acid sequenceconsisting of at least about 5 amino acids, and generally will consistof at least about 10 amino acids. It is also preferable that theseoligopeptides, peptides, or fragments are identical to a portion of theamino acid sequence of the natural protein. Short stretches of TRICHamino acids may be fused with those of another protein, such as KLH, andantibodies to the chimeric molecule may be produced.

[0236] Monoclonal antibodies to TRICH may be prepared using anytechnique which provides for the production of antibody molecules bycontinuous cell lines in culture. These include, but are not limited to,the hybridoma technique, the human B-cell hybridoma technique, and theEBV-hybridoma technique. (See, e.g., Kohler, G. et al. (1975) Nature256:495-497; Kozbor, D. et al. (1985) J. Immunol. Methods 81:31-42;Cote, R. J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; andCole, S. P. et al. (1984) Mol. Cell Biol. 62:109-120.)

[0237] In addition, techniques developed for the production of “chimericantibodies,” such as the splicing of mouse antibody genes to humanantibody genes to obtain a molecule with appropriate antigen specificityand biological activity, can be used. (See, e.g., Morrison, S. L. et al.(1984) Proc. Natl. Acad. Sci. USA 81:6851-6855; Neuberger, M. S. et al.(1984) Nature 312:604-608; and Takeda, S. et al. (1985) Nature314:452-454.) Alternatively, techniques described for the production ofsingle chain antibodies may be adapted, using methods known in the art,to produce TRICH-specific single chain antibodies. Antibodies withrelated specificity, but of distinct idiotypic composition, may begenerated by chain shuffling from random combinatorial immunoglobulinlibraries. (See, e.g., Burton, D. R. (1991) Proc. Natl. Acad. Sci. USA88:10134-10137.)

[0238] Antibodies may also be produced by inducing in vivo production inthe lymphocyte population or by screening immunoglobulin libraries orpanels of highly specific binding reagents as disclosed in theliterature. (See, e.g., Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci.USA 86:3833-3837; Winter, G. et al. (1991) Nature 349:293-299.)

[0239] Antibody fragments which contain specific binding sites for TRICH may also be generated. For example, such fragments include, but are not limited to, F(ab′)₂ fragments produced by pepsin digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of the F(ab′)₂ fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity. (See, e.g., Huse, W. D. et al. (1989) Science 246:1275-1281.)

[0240] Various immunoassays may be used for screening to identifyantibodies having the desired specificity. Numerous protocols forcompetitive binding or immunoradiometric assays using either polyclonalor monoclonal antibodies with established specificities are well knownin the art. Such immunoassays typically involve the measurement ofcomplex formation between TRICH and its specific antibody. A two-site,monoclonal-based immunoassay utilizing monoclonal antibodies reactive totwo non-interfering TRICH epitopes is generally used, but a competitivebinding assay may also be employed (Pound, supra).

[0241] Various methods such as Scatchard analysis in conjunction with radioimmunoassay techniques may be used to assess the affinity of antibodies for TRICH. Affinity is expressed as an association constant, K_(a), which is defined as the molar concentration of TRICH-antibody complex divided by the product of the molar concentrations of free antigen and free antibody under equilibrium conditions. The K_(a) determined for a preparation of polyclonal antibodies, which are heterogeneous in their affinities for multiple TRICH epitopes, represents the average affinity, or avidity, of the antibodies for TRICH. The K_(a) determined for a preparation of monoclonal antibodies, which are monospecific for a particular TRICH epitope, represents a true measure of affinity. High-affinity antibody preparations with K_(a) ranging from about 10⁹ to 10¹² L/mole are preferred for use in immunoassays in which the TRICH-antibody complex must withstand rigorous manipulations. Low-affinity antibody preparations with K_(a) ranging from about 10⁶ to 10⁷ L/mole are preferred for use in immunopurification and similar procedures which ultimately require dissociation of TRICH, preferably in active form, from the antibody (Catty, D. (1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington D.C.; Liddell, J. E. and A. Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons, New York N.Y.).
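
As an illustration only, the Scatchard analysis mentioned above may be sketched as a straight-line fit. For a single class of binding sites, the ratio of bound to free antigen equals K_(a) multiplied by (B_max minus bound), so the slope of bound/free plotted against bound is -K_(a). The function names, the single-site assumption, and the noise-free simulated data are assumptions of this sketch and are not taken from the specification; for a polyclonal preparation the fitted value would represent only an average.

```python
def scatchard_fit(bound, free):
    """Estimate (K_a, B_max) from paired bound and free antigen concentrations.

    Assumes one class of binding sites, so that
        bound / free = K_a * (B_max - bound)
    and a least-squares line through (bound, bound/free) has slope -K_a.
    Concentrations are in mol/L, so K_a is returned in L/mol.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k_a = -slope                 # slope of the Scatchard line is -K_a
    b_max = intercept / k_a      # x-intercept gives total binding sites
    return k_a, b_max


def simulate_binding(k_a, b_max, free):
    """Hypothetical noise-free equilibrium data for a single site class."""
    return [b_max * k_a * f / (1.0 + k_a * f) for f in free]
```

With real radioimmunoassay data the points scatter about the line, and a curved plot suggests heterogeneous affinities, in which case a single fitted K_(a) should be read as an average rather than a true affinity.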

[0242] The titer and avidity of polyclonal antibody preparations may befurther evaluated to determine the quality and suitability of suchpreparations for certain downstream applications. For example, apolyclonal antibody preparation containing at least 1-2 mg specificantibody/ml, preferably 5-10 mg specific antibody/ml, is generallyemployed in procedures requiring precipitation of TRICH-antibodycomplexes. Procedures for evaluating antibody specificity, titer, andavidity, and guidelines for antibody quality and usage in variousapplications, are generally available. (See, e.g., Catty, supra, andColigan et al. supra.)

[0243] In another embodiment of the invention, the polynucleotides encoding TRICH, or any fragment or complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene expression can be achieved by designing complementary sequences or antisense molecules (DNA, RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding TRICH. Such technology is well known in the art, and antisense oligonucleotides or larger fragments can be designed from various locations along the coding or control regions of sequences encoding TRICH. (See, e.g., Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press Inc., Totowa N.J.)

[0244] In therapeutic use, any gene delivery system suitable forintroduction of the antisense sequences into appropriate target cellscan be used. Antisense sequences can be delivered intracellularly in theform of an expression plasmid which, upon transcription, produces asequence complementary to at least a portion of the cellular sequenceencoding the target protein. (See, e.g., Slater, J. E. et al. (1998) J.Allergy Clin. Immunol. 102(3):469-475; and Scanlon, K. J. et al. (1995)9(13):1288-1296.) Antisense sequences can also be introducedintracellularly through the use of viral vectors, such as retrovirus andadeno-associated virus vectors. (See, e.g., Miller, A. D. (1990) Blood76:271; Ausubel, supra; Uckert, W. and W. Walther (1994) Pharmacol.Ther. 63(3):323-347.) Other gene delivery mechanisms includeliposome-derived systems, artificial viral envelopes, and other systemsknown in the art. (See, e.g., Rossi, J. J. (1995) Br. Med. Bull.51(1):217-225; Boado, R. J. et al. (1998) J. Pharm. Sci.87(11):1308-1315; and Morris, M. C. et al. (1997) Nucleic Acids Res.25(14):2730-2736.)

[0245] In another embodiment of the invention, polynucleotides encoding TRICH may be used for somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency (e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency (Blaese, R. M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475), cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R. G. et al. (1995) Hum. Gene Therapy 6:643-666; Crystal, R. G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal, R. G. (1995) Science 270:404-410; Verma, I. M. and N. Somia (1997) Nature 389:239-242)), (ii) express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated cell proliferation), or (iii) express a protein which affords protection against intracellular parasites (e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D. (1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399), hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the case where a genetic deficiency in TRICH expression or regulation causes disease, the expression of TRICH from an appropriate population of transduced cells may alleviate the clinical manifestations caused by the genetic deficiency.

[0246] In a further embodiment of the invention, diseases or disorders caused by deficiencies in TRICH are treated by constructing mammalian expression vectors encoding TRICH and introducing these vectors by mechanical means into TRICH-deficient cells. Mechanical transfer technologies for use with cells in vivo or ex vivo include (i) direct DNA microinjection into individual cells, (ii) ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene transfer, and (v) the use of DNA transposons (Morgan, R. A. and W. F. Anderson (1993) Annu. Rev. Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J-L. and H. Recipon (1998) Curr. Opin. Biotechnol. 9:445-450).

[0247] Expression vectors that may be effective for the expression ofTRICH include, but are not limited to, the PCDNA 3.1, EPITAG, PRCCMV2,PREP, PVAX, PCR2-TOPOTA vectors (Invitrogen, Carlsbad Calif.),PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla Calif.), andPTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo AltoCalif.). TRICH may be expressed using (i) a constitutively activepromoter, (e.g., from cytomegalovirus (CMV), Rous sarcoma virus (RSV),SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an induciblepromoter (e.g., the tetracycline-regulated promoter (Gossen, M. and H.Bujard (1992) Proc. Natl. Acad. Sci. USA 89:5547-5551; Gossen, M. et al.(1995) Science 268:1766-1769; Rossi, F. M. V. and H. M. Blau (1998)Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REXplasmid (Invitrogen)); the ecdysone-inducible promoter (available in theplasmids PVGRXR and PIND; Invitrogen); the FK506/rapamycin induciblepromoter; or the RU486/mifepristone inducible promoter (Rossi, F. M. V.and H. M. Blau, supra), or (iii) a tissue-specific promoter or thenative promoter of the endogenous gene encoding TRICH from a normalindividual.

[0248] Commercially available liposome transformation kits (e.g., thePERFECT LIPID TRANSFECTION KIT, available from Invitrogen) allow onewith ordinary skill in the art to deliver polynucleotides to targetcells in culture and require minimal effort to optimize experimentalparameters. In the alternative, transformation is performed using thecalcium phosphate method (Graham, F. L. and A. J. Eb (1973) Virology52:456-467), or by electroporation (Neumann, E. et al. (1982) EMBO J.1:841-845). The introduction of DNA to primary cells requiresmodification of these standardized mammalian transfection protocols.

[0249] In another embodiment of the invention, diseases or disorderscaused by genetic defects with respect to TRICH expression are treatedby constructing a retrovirus vector consisting of (i) the polynucleotideencoding TRICH under the control of an independent promoter or theretrovirus long terminal repeat (LTR) promoter, (ii) appropriate RNApackaging signals, and (iii) a Rev-responsive element (RRE) along withadditional retrovirus cis-acting RNA sequences and coding sequencesrequired for efficient vector propagation. Retrovirus vectors (e.g., PFBand PFBNEO) are commercially available (Stratagene) and are based onpublished data (Riviere, I. et al. (1995) Proc. Natl. Acad. Sci. USA92:6733-6737), incorporated by reference herein. The vector ispropagated in an appropriate vector producing cell line (VPCL) thatexpresses an envelope gene with a tropism for receptors on the targetcells or a promiscuous envelope protein such as VSVg (Armentano, D. etal. (1987) J. Virol. 61:1647-1650; Bender, M. A. et al. (1987) J. Virol.61:1639-1646; Adam, M. A. and A. D. Miller (1988) J. Virol.62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey,R. et al. (1998) J. Virol. 72:9873-9880). U.S. Pat. No. 5,910,434 toRigg (“Method for obtaining retrovirus packaging cell lines producinghigh transducing efficiency retroviral supernatant”) discloses a methodfor obtaining retrovirus packaging cell lines and is hereby incorporatedby reference. Propagation of retrovirus vectors, transduction of apopulation of cells (e.g., CD4⁺ T-cells), and the return of transducedcells to a patient are procedures well known to persons skilled in theart of gene therapy and have been well documented (Ranga, U. et al.(1997) J. Virol. 71:7020-7029; Bauer, G. et al. (1997) Blood89:2259-2267; Bonyhadi, M. L. (1997) J. Virol. 71:4707-4716; Ranga, U.et al. (1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997)Blood 89:2283-2290).

[0250] In the alternative, an adenovirus-based gene therapy delivery system is used to deliver polynucleotides encoding TRICH to cells which have one or more genetic abnormalities with respect to the expression of TRICH. The construction and packaging of adenovirus-based vectors are well known to those with ordinary skill in the art. Replication-defective adenovirus vectors have proven to be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas (Csete, M. E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are described in U.S. Pat. No. 5,707,618 to Armentano (“Adenovirus vectors for gene therapy”), hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P. A. et al. (1999) Annu. Rev. Nutr. 19:511-544 and Verma, I. M. and N. Somia (1997) Nature 389:239-242, both incorporated by reference herein.

[0251] In another alternative, a herpes-based, gene therapy deliverysystem is used to deliver polynucleotides encoding TRICH to target cellswhich have one or more genetic abnormalities with respect to theexpression of TRICH. The use of herpes simplex virus (HSV)-based vectorsmay be especially valuable for introducing TRICH to cells of the centralnervous system, for which HSV has a tropism. The construction andpackaging of herpes-based vectors are well known to those with ordinaryskill in the art. A replication-competent herpes simplex virus (HSV)type 1-based vector has been used to deliver a reporter gene to the eyesof primates (Liu, X. et al. (1999) Exp. Eye Res. 169:385-395). Theconstruction of a HSV-1 virus vector has also been disclosed in detailin U.S. Pat. No. 5,804,413 to DeLuca (“Herpes simplex virus strains forgene transfer”), which is hereby incorporated by reference. U.S. Pat.No. 5,804,413 teaches the use of recombinant HSV d92 which consists of agenome containing at least one exogenous gene to be transferred to acell under the control of the appropriate promoter for purposesincluding human gene therapy. Also taught by this patent are theconstruction and use of recombinant HSV strains deleted for ICP4, ICP27and ICP22. For HSV vectors, see also Goins, W. F. et al. (1999) J.Virol. 73:519-532 and Xu, H. et al. (1994) Dev. Biol. 163:152-161,hereby incorporated by reference. The manipulation of cloned herpesvirussequences, the generation of recombinant virus following thetransfection of multiple plasmids containing different segments of thelarge herpesvirus genomes, the growth and propagation of herpesvirus,and the infection of cells with herpesvirus are techniques well known tothose of ordinary skill in the art.

[0252] In another alternative, an alphavirus (positive, single-stranded RNA virus) vector is used to deliver polynucleotides encoding TRICH to target cells. The biology of the prototypic alphavirus, Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA, resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity (e.g., protease and polymerase). Similarly, inserting the coding sequence for TRICH into the alphavirus genome in place of the capsid-coding region results in the production of a large number of TRICH-coding RNAs and the synthesis of high levels of TRICH in vector transduced cells. While alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a persistent infection in baby hamster kidney cells (BHK-21) with a variant of Sindbis virus (SIN) indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy application (Dryga, S. A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will allow the introduction of TRICH into a variety of cell types. The specific transduction of a subset of cells in a population may require the sorting of cells prior to transduction. The methods of manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA transfections, and performing alphavirus infections, are well known to those with ordinary skill in the art.

[0253] Oligonucleotides derived from the transcription initiation site,e.g., between about positions −10 and +10 from the start site, may alsobe employed to inhibit gene expression. Similarly, inhibition can beachieved using triple helix base-pairing methodology. Triple helixpairing is useful because it causes inhibition of the ability of thedouble helix to open sufficiently for the binding of polymerases,transcription factors, or regulatory molecules. Recent therapeuticadvances using triplex DNA have been described in the literature. (See,e.g., Gee, J. E. et al. (1994) in Huber, B. E. and B. I. Carr, Molecularand Immunologic Approaches, Futura Publishing, Mt. Kisco N.Y., pp.163-177.) A complementary sequence or antisense molecule may also bedesigned to block translation of mRNA by preventing the transcript frombinding to ribosomes.

[0254] Ribozymes, enzymatic RNA molecules, may also be used to catalyzethe specific cleavage of RNA. The mechanism of ribozyme action involvessequence-specific hybridization of the ribozyme molecule tocomplementary target RNA, followed by endonucleolytic cleavage. Forexample, engineered hammerhead motif ribozyme molecules may specificallyand efficiently catalyze endonucleolytic cleavage of sequences encodingTRICH.

[0255] Specific ribozyme cleavage sites within any potential RNA target are initially identified by scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA, GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides, corresponding to the region of the target gene containing the cleavage site, may be evaluated for secondary structural features which may render the oligonucleotide inoperable. The suitability of candidate targets may also be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays.
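
By way of a minimal sketch, the initial scan described above may be expressed as a search for the triplets GUA, GUU, and GUC followed by extraction of a 15 to 20 ribonucleotide region around each hit. The function name, the centering rule for the window, and the convenience conversion of T to U for cDNA input are hypothetical choices of this sketch; the secondary structure evaluation and ribonuclease protection assays are not modeled.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def find_candidate_sites(rna, window=20):
    """Return (position, triplet, region) for each candidate cleavage site.

    `position` is the 0-based index of the first base of the triplet.
    `region` is a stretch of up to `window` ribonucleotides that contains
    the triplet; near either end of the target it is shifted inward, and
    it is shorter than `window` only if the whole target is shorter.
    """
    if not 15 <= window <= 20:
        raise ValueError("window must be between 15 and 20 ribonucleotides")
    # Convenience assumption: accept DNA-style input and read it as RNA.
    rna = rna.upper().replace("T", "U")
    sites = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in CLEAVAGE_TRIPLETS:
            # Roughly center the window on the middle base of the triplet.
            start = max(0, i + 1 - window // 2)
            end = min(len(rna), start + window)
            start = max(0, end - window)
            sites.append((i, triplet, rna[start:end]))
    return sites
```

Each returned region would then be a candidate for the structural and accessibility evaluation described in the paragraph above.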

[0256] Complementary ribonucleic acid molecules and ribozymes of theinvention may be prepared by any method known in the art for thesynthesis of nucleic acid molecules. These include techniques forchemically synthesizing oligonucleotides such as solid phasephosphoramidite chemical synthesis. Alternatively, RNA molecules may begenerated by in vitro and in vivo transcription of DNA sequencesencoding TRICH. Such DNA sequences may be incorporated into a widevariety of vectors with suitable RNA polymerase promoters such as T7 orSP6. Alternatively, these cDNA constructs that synthesize complementaryRNA, constitutively or inducibly, can be introduced into cell lines,cells, or tissues.

[0257] RNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5′ and/or 3′ ends of the molecule, or the use of phosphorothioate or 2′ O-methyl rather than phosphodiester linkages within the backbone of the molecule. This concept is inherent in the production of PNAs and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

[0258] An additional embodiment of the invention encompasses a methodfor screening for a compound which is effective in altering expressionof a polynucleotide encoding TRICH. Compounds which may be effective inaltering expression of a specific polynucleotide may include, but arenot limited to, oligonucleotides, antisense oligonucleotides, triplehelix-forming oligonucleotides, transcription factors and otherpolypeptide transcriptional regulators, and non-macromolecular chemicalentities which are capable of interacting with specific polynucleotidesequences. Effective compounds may alter polynucleotide expression byacting as either inhibitors or promoters of polynucleotide expression.Thus, in the treatment of disorders associated with increased TRICHexpression or activity, a compound which specifically inhibitsexpression of the polynucleotide encoding TRICH may be therapeuticallyuseful, and in the treatment of disorders associated with decreasedTRICH expression or activity, a compound which specifically promotesexpression of the polynucleotide encoding TRICH may be therapeuticallyuseful.

[0259] At least one, and up to a plurality, of test compounds may bescreened for effectiveness in altering expression of a specificpolynucleotide. A test compound may be obtained by any method commonlyknown in the art, including chemical modification of a compound known tobe effective in altering polynucleotide expression; selection from anexisting, commercially-available or proprietary library ofnaturally-occurring or non-natural chemical compounds; rational designof a compound based on chemical and/or structural properties of thetarget polynucleotide; and selection from a library of chemicalcompounds created combinatorially or randomly. A sample comprising apolynucleotide encoding TRICH is exposed to at least one test compoundthus obtained. The sample may comprise, for example, an intact orpermeabilized cell, or an in vitro cell-free or reconstitutedbiochemical system. Alterations in the expression of a polynucleotideencoding TRICH are assayed by any method commonly known in the art.Typically, the expression of a specific nucleotide is detected byhybridization with a probe having a nucleotide sequence complementary tothe sequence of the polynucleotide encoding TRICH. The amount ofhybridization may be quantified, thus forming the basis for a comparisonof the expression of the polynucleotide both with and without exposureto one or more test compounds. Detection of a change in the expressionof a polynucleotide exposed to a test compound indicates that the testcompound is effective in altering the expression of the polynucleotide.A screen for a compound effective in altering expression of a specificpolynucleotide can be carried out, for example, using aSchizosaccharomyces pombe gene expression system (Atkins, D. et al.(1999) U.S. Pat. No. 5,932,435; Arndt, G. M. et al. (2000) Nucleic AcidsRes. 28:E15) or a human cell line such as HeLa cell (Clarke, M. L. etal. (2000) Biochem. Biophys. Res. Commun. 268:8-13). 
A particular embodiment of the present invention involves screening a combinatorial library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide nucleic acids, and modified oligonucleotides) for antisense activity against a specific polynucleotide sequence (Bruice, T. W. et al. (1997) U.S. Pat. No. 5,686,242; Bruice, T. W. et al. (2000) U.S. Pat. No. 6,022,691).

[0260] Many methods for introducing vectors into cells or tissues are available and equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells taken from the patient and clonally propagated for autologous transplant back into that same patient. Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved using methods which are well known in the art. (See, e.g., Goldman, C. K. et al. (1997) Nat. Biotechnol. 15:462-466.)

[0261] Any of the therapeutic methods described above may be applied to any subject in need of such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and monkeys.

[0262] An additional embodiment of the invention relates to the administration of a composition which generally comprises an active ingredient formulated with a pharmaceutically acceptable excipient. Excipients may include, for example, sugars, starches, celluloses, gums, and proteins. Various formulations are commonly known and are thoroughly discussed in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing, Easton Pa.). Such compositions may consist of TRICH, antibodies to TRICH, and mimetics, agonists, antagonists, or inhibitors of TRICH.

[0263] The compositions utilized in this invention may be administeredby any number of routes including, but not limited to, oral,intravenous, intramuscular, intra-arterial, intramedullary, intrathecal,intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal,intranasal, enteral, topical, sublingual, or rectal means.

[0264] Compositions for pulmonary administration may be prepared inliquid or dry powder form. These compositions are generally aerosolizedimmediately prior to inhalation by the patient. In the case of smallmolecules (e.g. traditional low molecular weight organic drugs), aerosoldelivery of fast-acting formulations is well-known in the art. In thecase of macromolecules (e.g. larger peptides and proteins), recentdevelopments in the field of pulmonary delivery via the alveolar regionof the lung have enabled the practical delivery of drugs such as insulinto blood circulation (see, e.g., Patton, J. S. et al., U.S. Pat. No.5,997,848). Pulmonary delivery has the advantage of administrationwithout needle injection, and obviates the need for potentially toxicpenetration enhancers.

[0265] Compositions suitable for use in the invention includecompositions wherein the active ingredients are contained in aneffective amount to achieve the intended purpose. The determination ofan effective dose is well within the capability of those skilled in theart.

[0266] Specialized forms of compositions may be prepared for directintracellular delivery of macromolecules comprising TRICH or fragmentsthereof. For example, liposome preparations containing acell-impermeable macromolecule may promote cell fusion and intracellulardelivery of the macromolecule. Alternatively, TRICH or a fragmentthereof may be joined to a short cationic N-terminal portion from theHIV Tat-1 protein. Fusion proteins thus generated have been found totransduce into the cells of all tissues, including the brain, in a mousemodel system (Schwarze, S. R. et al. (1999) Science 285:1569-1572).

[0267] For any compound, the therapeutically effective dose can beestimated initially either in cell culture assays, e.g., of neoplasticcells, or in animal models such as mice, rats, rabbits, dogs, monkeys,or pigs. An animal model may also be used to determine the appropriateconcentration range and route of administration. Such information canthen be used to determine useful doses and routes for administration inhumans.

[0268] A therapeutically effective dose refers to that amount of active ingredient, for example TRICH or fragments thereof, antibodies of TRICH, and agonists, antagonists or inhibitors of TRICH, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures in cell cultures or with experimental animals, such as by calculating the ED₅₀ (the dose therapeutically effective in 50% of the population) or LD₅₀ (the dose lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the therapeutic index, which can be expressed as the LD₅₀/ED₅₀ ratio. Compositions which exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are used to formulate a range of dosage for human use. The dosage contained in such compositions is preferably within a range of circulating concentrations that includes the ED₅₀ with little or no toxicity. The dosage varies within this range depending upon the dosage form employed, the sensitivity of the patient, and the route of administration.
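
As a hedged illustration of the calculation described above, the sketch below estimates a 50% dose from dose-response pairs and then forms the LD₅₀/ED₅₀ ratio. The function names and the use of linear interpolation on a log10 dose scale are assumptions of this sketch; the specification does not prescribe a particular estimation method, and standard practice may instead use probit or logistic fitting.

```python
import math


def dose_at_half_response(doses, fractions):
    """Interpolate the dose at which 50% of the population responds.

    `doses` must be positive and ascending; `fractions` are the matching
    responding fractions (0 to 1). Interpolation is linear in log10(dose),
    which is an assumption of this sketch.
    """
    points = list(zip(doses, fractions))
    for (d_lo, f_lo), (d_hi, f_hi) in zip(points, points[1:]):
        if f_lo <= 0.5 <= f_hi and f_hi > f_lo:
            t = (0.5 - f_lo) / (f_hi - f_lo)
            log_lo, log_hi = math.log10(d_lo), math.log10(d_hi)
            return 10.0 ** (log_lo + t * (log_hi - log_lo))
    raise ValueError("responses do not bracket 50%")


def therapeutic_index(ld50, ed50):
    """Return the LD50/ED50 ratio; larger values are preferred."""
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50
```

For example, an efficacy curve yielding an ED₅₀ of 10 units and a lethality curve yielding an LD₅₀ of about 316 units would give a therapeutic index of about 31.6; these figures are invented for illustration only.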

[0269] The exact dosage will be determined by the practitioner, in light of factors related to the subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the active moiety or to maintain the desired effect. Factors which may be taken into account include the severity of the disease state, the general health of the subject, the age, weight, and gender of the subject, time and frequency of administration, drug combination(s), reaction sensitivities, and response to therapy. Long-acting compositions may be administered every 3 to 4 days, every week, or biweekly depending on the half-life and clearance rate of the particular formulation.

[0270] Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of about 1 gram, depending upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature and generally available to practitioners in the art. Those skilled in the art will employ different formulations for nucleotides than for proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, conditions, locations, etc.

DIAGNOSTICS

[0271] In another embodiment, antibodies which specifically bind TRICH may be used for the diagnosis of disorders characterized by expression of TRICH, or in assays to monitor patients being treated with TRICH or agonists, antagonists, or inhibitors of TRICH. Antibodies useful for diagnostic purposes may be prepared in the same manner as described above for therapeutics. Diagnostic assays for TRICH include methods which utilize the antibody and a label to detect TRICH in human body fluids or in extracts of cells or tissues. The antibodies may be used with or without modification, and may be labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of reporter molecules, several of which are described above, are known in the art and may be used.

[0272] A variety of protocols for measuring TRICH, including ELISAs, RIAs, and FACS, are known in the art and provide a basis for diagnosing altered or abnormal levels of TRICH expression. Normal or standard values for TRICH expression are established by combining body fluids or cell extracts taken from normal mammalian subjects, for example, human subjects, with antibodies to TRICH under conditions suitable for complex formation. The amount of standard complex formation may be quantitated by various methods, such as photometric means. Quantities of TRICH expressed in subject, control, and disease samples from biopsied tissues are compared with the standard values. Deviation between standard and subject values establishes the parameters for diagnosing disease.
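The comparison of a subject value against standard values can be sketched as follows. This is one simple possibility only: the use of a z-score, the cutoff of 2.0, the function names, and the sample readings are all assumptions made for illustration and are not prescribed by this disclosure.

```python
from statistics import mean, stdev


def deviation_from_standard(normal_values, subject_value):
    """Express a subject value as a z-score relative to normal (standard) values."""
    mu = mean(normal_values)
    sigma = stdev(normal_values)  # sample standard deviation of the normal subjects
    if sigma == 0:
        raise ValueError("normal values show no variation")
    return (subject_value - mu) / sigma


def is_altered(normal_values, subject_value, z_cutoff=2.0):
    """Flag a subject value that deviates from the standard by more than z_cutoff."""
    return abs(deviation_from_standard(normal_values, subject_value)) > z_cutoff


# Hypothetical photometric readings from normal subjects (arbitrary units).
normal = [10.0, 12.0, 11.0, 9.0, 13.0]
print(is_altered(normal, 20.0))
print(is_altered(normal, 11.5))
```

In practice the cutoff would be chosen from validated reference ranges rather than the fixed value assumed here.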

[0273] In another embodiment of the invention, the polynucleotides encoding TRICH may be used for diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences, complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect and quantify gene expression in biopsied tissues in which expression of TRICH may be correlated with disease. The diagnostic assay may be used to determine absence, presence, and excess expression of TRICH, and to monitor regulation of TRICH levels during therapeutic intervention.

[0274] In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide sequences, including genomic sequences, encoding TRICH or closely related molecules may be used to identify nucleic acid sequences which encode TRICH. The specificity of the probe, whether it is made from a highly specific region, e.g., the 5′ regulatory region, or from a less specific region, e.g., a conserved motif, and the stringency of the hybridization or amplification will determine whether the probe identifies only naturally occurring sequences encoding TRICH, allelic variants, or related sequences.

[0275] Probes may also be used for the detection of related sequences, and may have at least 50% sequence identity to any of the TRICH encoding sequences. The hybridization probes of the subject invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO:21-40 or from genomic sequences including promoters, enhancers, and introns of the TRICH gene.
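The 50% sequence identity threshold mentioned above can be illustrated with a minimal sketch. It assumes that the two sequences have already been aligned to equal length (with "-" marking gaps) and that identity is computed over the full alignment length; other conventions for percent identity exist. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns in which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be pre-aligned, non-empty, and equal in length")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"  # a gap paired with a gap is not counted as a match
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical probe and target fragments, not taken from SEQ ID NO:21-40.
probe = "ACGTACGT"
target = "ACGTTCGA"
identity = percent_identity(probe, target)
print(identity, identity >= 50.0)
```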

[0276] Means for producing specific hybridization probes for DNAs encoding TRICH include the cloning of polynucleotide sequences encoding TRICH or TRICH derivatives into vectors for the production of mRNA probes. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a variety of reporter groups, for example, by radionuclides such as ³²P or ³⁵S, or by enzymatic labels, such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

[0277] Polynucleotide sequences encoding TRICH may be used for the diagnosis of disorders associated with expression of TRICH. Examples of such disorders include, but are not limited to, a transport disorder such as akinesia, amyotrophic lateral sclerosis, ataxia telangiectasia, cystic fibrosis, Becker's muscular dystrophy, Bell's palsy, Charcot-Marie-Tooth disease, diabetes mellitus, diabetes insipidus, diabetic neuropathy, Duchenne muscular dystrophy, hyperkalemic periodic paralysis, normokalemic periodic paralysis, Parkinson's disease, malignant hyperthermia, multidrug resistance, myasthenia gravis, myotonic dystrophy, catatonia, tardive dyskinesia, dystonias, peripheral neuropathy, cerebral neoplasms, prostate cancer, cardiac disorders associated with transport, e.g., angina, bradyarrhythmia, tachyarrhythmia, hypertension, Long QT syndrome, myocarditis, cardiomyopathy, nemaline myopathy, centronuclear myopathy, lipid myopathy, mitochondrial myopathy, thyrotoxic myopathy, ethanol myopathy, dermatomyositis, inclusion body myositis, infectious myositis, polymyositis, neurological disorders associated with transport, e.g., Alzheimer's disease, amnesia, bipolar disorder, dementia, depression, epilepsy, Tourette's disorder, paranoid psychoses, and schizophrenia, and other disorders associated with transport, e.g., neurofibromatosis, postherpetic neuralgia, trigeminal neuropathy, sarcoidosis, sickle cell anemia, Wilson's disease, cataracts, infertility, pulmonary artery stenosis, sensorineural autosomal deafness, hyperglycemia, hypoglycemia, Graves' disease, goiter, Cushing's disease, Addison's disease, glucose-galactose malabsorption syndrome, glycogen storage disease, hypercholesterolemia, adrenoleukodystrophy, Zellweger syndrome, Menkes disease, occipital horn syndrome, von Gierke disease, pseudohypoaldosteronism type 1, Liddle's syndrome, cystinuria, iminoglycinuria, Hartnup disease, Fanconi disease, and Bartter syndrome; a neurological disorder such as epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders, amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy, retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis, tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental retardation and other developmental disorders of the central nervous system including Down syndrome, cerebral palsy, neuroskeletal disorders, autonomic nervous system disorders, cranial nerve disorders, spinal cord diseases, muscular dystrophy and other neuromuscular disorders, peripheral nervous system disorders, dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathies, myasthenia gravis, periodic paralysis, mental disorders including mood, anxiety, and schizophrenic disorders, seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, hemiplegic migraine, tardive dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, Tourette's disorder, progressive supranuclear palsy, corticobasal degeneration, and familial frontotemporal dementia; a muscle disorder such as cardiomyopathy, myocarditis, Duchenne's muscular dystrophy, Becker's muscular dystrophy, myotonic dystrophy, central core disease, nemaline myopathy, centronuclear myopathy, lipid myopathy, mitochondrial myopathy, infectious myositis, polymyositis, dermatomyositis, inclusion body myositis, thyrotoxic myopathy, ethanol myopathy, angina, anaphylactic shock, arrhythmias, asthma, cardiovascular shock, Cushing's syndrome, hypertension, hypoglycemia, myocardial infarction, migraine, pheochromocytoma, and myopathies including encephalopathy, epilepsy, Kearns-Sayre syndrome, lactic acidosis, myoclonic disorder, ophthalmoplegia, acid maltase deficiency (AMD, also known as Pompe's disease), generalized myotonia, and myotonia congenita; an immunological disorder such as acquired immunodeficiency syndrome (AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis, amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis, autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis, cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis, erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves' disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis, myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis, polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjögren's syndrome, systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura, ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and trauma; and a cell proliferative disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria, polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma, leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, cancers of the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia, gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate, salivary glands, skin, spleen, testis, thymus, thyroid, and uterus. The polynucleotide sequences encoding TRICH may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; in dipstick, pin, and multiformat ELISA-like assays; and in microarrays utilizing fluids or tissues from patients to detect altered TRICH expression. Such qualitative or quantitative methods are well known in the art.

[0278] In a particular aspect, the nucleotide sequences encoding TRICH may be useful in assays that detect the presence of associated disorders, particularly those mentioned above. The nucleotide sequences encoding TRICH may be labeled by standard methods and added to a fluid or tissue sample from a patient under conditions suitable for the formation of hybridization complexes. After a suitable incubation period, the sample is washed and the signal is quantified and compared with a standard value. If the amount of signal in the patient sample is significantly altered in comparison to a control sample, then the presence of altered levels of nucleotide sequences encoding TRICH in the sample indicates the presence of the associated disorder. Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or to monitor the treatment of an individual patient.

[0279] In order to provide a basis for the diagnosis of a disorder associated with expression of TRICH, a normal or standard profile for expression is established. This may be accomplished by combining body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a fragment thereof, encoding TRICH, under conditions suitable for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained from normal subjects with values from an experiment in which a known amount of a substantially purified polynucleotide is used. Standard values obtained in this manner may be compared with values obtained from samples from patients who are symptomatic for a disorder. Deviation from standard values is used to establish the presence of a disorder.

[0280] Once the presence of a disorder is established and a treatment protocol is initiated, hybridization assays may be repeated on a regular basis to determine if the level of expression in the patient begins to approximate that which is observed in the normal subject. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.

[0281] With respect to cancer, the presence of an abnormal amount of transcript (either under- or overexpressed) in biopsied tissue from an individual may indicate a predisposition for the development of the disease, or may provide a means for detecting the disease prior to the appearance of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ preventative measures or aggressive treatment earlier, thereby preventing the development or further progression of the cancer.

[0282] Additional diagnostic uses for oligonucleotides designed from the sequences encoding TRICH may involve the use of PCR. These oligomers may be chemically synthesized, generated enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide encoding TRICH, or a fragment of a polynucleotide complementary to the polynucleotide encoding TRICH, and will be employed under optimized conditions for identification of a specific gene or condition. Oligomers may also be employed under less stringent conditions for detection or quantification of closely related DNA or RNA sequences.

[0283] In a particular aspect, oligonucleotide primers derived from the polynucleotide sequences encoding TRICH may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions and deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP) and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from the polynucleotide sequences encoding TRICH are used to amplify DNA using the polymerase chain reaction (PCR). The DNA may be derived, for example, from diseased or normal tissue, biopsy samples, bodily fluids, and the like. SNPs in the DNA cause differences in the secondary and tertiary structures of PCR products in single-stranded form, and these differences are detectable using gel electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently labeled, which allows detection of the amplimers in high-throughput equipment such as DNA sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP (isSNP), are capable of identifying polymorphisms by comparing the sequence of individual overlapping DNA fragments which assemble into a common consensus sequence. These computer-based methods filter out sequence variations due to laboratory preparation of DNA and sequencing errors using statistical models and automated analyses of DNA sequence chromatograms. In the alternative, SNPs may be detected and characterized by mass spectrometry using, for example, the high throughput MASSARRAY system (Sequenom, Inc., San Diego Calif.).
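The in silico comparison of overlapping fragments described above may be sketched, in heavily simplified form, as a column-by-column scan of pre-aligned reads. The sketch replaces the statistical models and chromatogram analyses mentioned above with a plain minimum-count filter, which is an assumption made only for illustration; the function name, the threshold, and the reads are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_count=2):
    """Return (position, major_base, minor_base) for columns with a supported variant.

    A column is reported only if its second most common base is seen in at least
    min_minor_count reads, a crude stand-in for filtering out sequencing errors.
    """
    if not aligned_reads:
        return []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be pre-aligned to equal length")
    snps = []
    for pos in range(length):
        counts = Counter(read[pos] for read in aligned_reads if read[pos] in "ACGT")
        if len(counts) < 2:
            continue  # every read agrees at this column
        (major, _), (minor, minor_n) = counts.most_common(2)
        if minor_n >= min_minor_count:
            snps.append((pos, major, minor))
    return snps


# Hypothetical aligned fragments; position 4 differs in a single read only.
reads = ["ACGTAC", "ACGTAC", "ACATAC", "ACATAC", "ACGTTC"]
print(candidate_snps(reads))
```

With the threshold assumed here, a variant seen in only one read is treated as a likely sequencing error and is not reported.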

[0284] SNPs may be used to study the genetic basis of human disease. For example, at least 16 common SNPs have been associated with non-insulin-dependent diabetes mellitus. SNPs are also useful for examining differences in disease outcomes in monogenic disorders, such as cystic fibrosis, sickle cell anemia, or chronic granulomatous disease. For example, variants in the mannose-binding lectin, MBL2, have been shown to be correlated with deleterious pulmonary outcomes in cystic fibrosis. SNPs also have utility in pharmacogenomics, the identification of genetic variants that influence a patient's response to a drug, such as life-threatening toxicity. For example, a variation in N-acetyl transferase is associated with a high incidence of peripheral neuropathy in response to the anti-tuberculosis drug isoniazid, while a variation in the core promoter of the ALOX5 gene results in diminished clinical response to treatment with an anti-asthma drug that targets the 5-lipoxygenase pathway. Analysis of the distribution of SNPs in different populations is useful for investigating genetic drift, mutation, recombination, and selection, as well as for tracing the origins of populations and their migrations. (Taylor, J. G. et al. (2001) Trends Mol. Med. 7:507-512; Kwok, P.-Y. and Z. Gu (1999) Mol. Med. Today 5:538-543; Nowotny, P. et al. (2001) Curr. Opin. Neurobiol. 11:637-641.)

[0285] Methods which may also be used to quantify the expression of TRICH include radiolabeling or biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from standard curves. (See, e.g., Melby, P. C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et al. (1993) Anal. Biochem. 212:229-236.) The speed of quantitation of multiple samples may be accelerated by running the assay in a high-throughput format where the oligomer or polynucleotide of interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid quantitation.
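Interpolation from a standard curve can be sketched as piecewise-linear interpolation between measured standards. The sketch assumes that signal increases with concentration across the standards; the function name, the units, and the example readings are hypothetical and are not data from this disclosure.

```python
def interpolate_concentration(standards, signal):
    """Estimate a concentration from a signal by linear interpolation.

    standards: iterable of (known_concentration, measured_signal) pairs.
    Raises ValueError if the signal lies outside the range of the standards.
    """
    points = sorted(standards, key=lambda pair: pair[1])  # order by signal
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1 and s1 > s0:
            fraction = (signal - s0) / (s1 - s0)
            return c0 + fraction * (c1 - c0)
    raise ValueError("signal is outside the range of the standard curve")


# Hypothetical dilution series: (concentration in ng/mL, absorbance).
curve = [(0.0, 0.05), (1.0, 0.25), (2.0, 0.45), (4.0, 0.85)]
print(interpolate_concentration(curve, 0.35))
```

Signals beyond the highest or lowest standard are rejected rather than extrapolated, which is a conservative design choice for this illustration.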

[0286] In further embodiments, oligonucleotides or longer fragments derived from any of the polynucleotide sequences described herein may be used as elements on a microarray. The microarray can be used in transcript imaging techniques which monitor the relative expression levels of large numbers of genes simultaneously as described below. The microarray may also be used to identify genetic variants, mutations, and polymorphisms. This information may be used to determine gene function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor progression/regression of disease as a function of gene expression, and to develop and monitor the activities of therapeutic agents in the treatment of disease. In particular, this information may be used to develop a pharmacogenomic profile of a patient in order to select the most appropriate and effective treatment regimen for that patient. For example, therapeutic agents which are highly effective and display the fewest side effects may be selected for a patient based on his/her pharmacogenomic profile.

[0287] In another embodiment, TRICH, fragments of TRICH, or antibodies specific for TRICH may be used as elements on a microarray. The microarray may be used to monitor or measure protein-protein interactions, drug-target interactions, and gene expression profiles, as described above.

[0288] A particular embodiment relates to the use of the polynucleotides of the present invention to generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by quantifying the number of expressed genes and their relative abundance under given conditions and at a given time. (See Seilhamer et al., “Comparative Gene Transcript Analysis,” U.S. Pat. No. 5,840,484, expressly incorporated by reference herein.) Thus a transcript image may be generated by hybridizing the polynucleotides of the present invention or their complements to the totality of transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the hybridization takes place in high-throughput format, wherein the polynucleotides of the present invention or their complements comprise a subset of a plurality of elements on a microarray. The resultant transcript image would provide a profile of gene activity.

[0289] Transcript images may be generated using transcripts isolated from tissues, cell lines, biopsies, or other biological samples. The transcript image may thus reflect gene expression in vivo, as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line.

[0290] Transcript images which profile the expression of the polynucleotides of the present invention may also be used in conjunction with in vitro model systems and preclinical evaluation of pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental compounds. All compounds induce characteristic gene expression patterns, frequently termed molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and toxicity (Nuwaysir, E. F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N. L. Anderson (2000) Toxicol. Lett. 112-113:467-471, expressly incorporated by reference herein). If a test compound has a signature similar to that of a compound with known toxicity, it is likely to share those toxic properties. These fingerprints or signatures are most useful and refined when they contain expression information from a large number of genes and gene families. Ideally, a genome-wide measurement of expression provides the highest quality signature. Even genes whose expression is not altered by any tested compounds are important, as the levels of expression of these genes are used to normalize the rest of the expression data. The normalization procedure is useful for comparison of expression data after treatment with different compounds. While the assignment of gene function to elements of a toxicant signature aids in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the statistical matching of signatures which leads to prediction of toxicity. (See, for example, Press Release 00-02 from the National Institute of Environmental Health Sciences, released Feb. 29, 2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm.) Therefore, it is important and desirable in toxicological screening using toxicant signatures to include all expressed gene sequences.
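The normalization and signature matching described above can be sketched as follows. The sketch assumes that normalization divides each expression value by the mean of a set of unaltered reference genes, and that signatures are compared by Pearson correlation; both choices, as well as all gene names and values, are illustrative assumptions rather than methods specified in this disclosure.

```python
from math import sqrt


def normalize(expression, reference_genes):
    """Scale expression values by the mean level of unaltered reference genes."""
    scale = sum(expression[g] for g in reference_genes) / len(reference_genes)
    return {gene: value / scale for gene, value in expression.items()}


def signature_similarity(signature_a, signature_b):
    """Pearson correlation of two signatures over the genes they share."""
    genes = sorted(set(signature_a) & set(signature_b))
    xs = [signature_a[g] for g in genes]
    ys = [signature_b[g] for g in genes]
    n = len(genes)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a signature with no variation cannot be correlated")
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values; "ref1" stands for an unaltered reference gene.
test_compound = normalize({"g1": 4.0, "g2": 8.0, "g3": 2.0, "ref1": 2.0}, ["ref1"])
known_toxicant = normalize({"g1": 6.0, "g2": 12.0, "g3": 3.0, "ref1": 3.0}, ["ref1"])
print(signature_similarity(test_compound, known_toxicant))
```

A correlation close to 1 would suggest that the test compound's signature resembles that of the known toxicant; any decision threshold would have to be established empirically.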

[0291] In one embodiment, the toxicity of a test compound is assessed by treating a biological sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the treated biological sample are hybridized with one or more probes specific to the polynucleotides of the present invention, so that transcript levels corresponding to the polynucleotides of the present invention may be quantified. The transcript levels in the treated biological sample are compared with levels in an untreated biological sample. Differences in the transcript levels between the two samples are indicative of a toxic response caused by the test compound in the treated sample.
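One common way to express the difference in transcript levels between a treated and an untreated sample is a log2 ratio. The sketch below assumes that approach; the pseudocount, the two-fold cutoff, the function names, and the transcript labels and values are hypothetical and serve only as an illustration.

```python
from math import log2


def log2_fold_changes(treated, untreated, pseudocount=1.0):
    """log2(treated/untreated) for each transcript measured in both samples."""
    return {
        name: log2((treated[name] + pseudocount) / (untreated[name] + pseudocount))
        for name in treated
        if name in untreated
    }


def changed_transcripts(treated, untreated, min_abs_log2=1.0):
    """Names of transcripts whose level changed at least two-fold (by default)."""
    changes = log2_fold_changes(treated, untreated)
    return sorted(name for name, value in changes.items() if abs(value) >= min_abs_log2)


# Hypothetical signal intensities for two transcripts.
treated_sample = {"transcript_a": 7.0, "transcript_b": 3.0}
untreated_sample = {"transcript_a": 1.0, "transcript_b": 3.0}
print(changed_transcripts(treated_sample, untreated_sample))
```

The pseudocount avoids division by zero for transcripts that are undetected in one sample.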

[0292] Another particular embodiment relates to the use of the polypeptide sequences of the present invention to analyze the proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression in a particular tissue or cell type. Each protein component of a proteome can be subjected individually to further analysis. Proteome expression patterns, or profiles, are analyzed by quantifying the number of expressed proteins and their relative abundance under given conditions and at a given time. A profile of a cell's proteome may thus be generated by separating and analyzing the polypeptides of a particular tissue or cell type. In one embodiment, the separation is achieved using two-dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot is generally proportional to the level of the protein in the sample. The optical densities of equivalently positioned protein spots from different samples, for example, from biological samples either treated or untreated with a test compound or therapeutic agent, are compared to identify any changes in protein spot density related to the treatment. The proteins in the spots are partially sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed by mass spectrometry. The identity of the protein in a spot may be determined by comparing its partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences of the present invention. In some cases, further sequence data may be obtained for definitive protein identification.
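The comparison of a partial sequence of at least 5 contiguous residues against known polypeptide sequences can be sketched as an exact substring search. The sketch assumes exact matching with no tolerance for sequencing errors; the function name and the peptide sequences are invented for illustration and do not correspond to any sequence of this disclosure.

```python
def identify_protein(partial_sequence, polypeptides, min_length=5):
    """Return names of polypeptides that contain the partial sequence exactly."""
    query = partial_sequence.upper()
    if len(query) < min_length:
        raise ValueError(f"partial sequence must have at least {min_length} residues")
    return [name for name, sequence in polypeptides.items() if query in sequence.upper()]


# Hypothetical polypeptide sequences in one-letter amino acid code.
known = {
    "polypeptide_1": "MKTLLVAGAVL",
    "polypeptide_2": "MSSGKRRTLLE",
}
print(identify_protein("TLLVA", known))
```

A short partial sequence may match more than one polypeptide, in which case further sequence data would be needed for a definitive identification.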

[0293] A proteomic profile may also be generated using antibodies specific for TRICH to quantify the levels of TRICH expression. In one embodiment, the antibodies are used as elements on a microarray, and protein expression levels are quantified by exposing the microarray to the sample and detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem. 270:103-111; Mendoza, L. G. et al. (1999) Biotechniques 27:778-788). Detection may be performed by a variety of methods known in the art, for example, by reacting the proteins in the sample with a thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at each array element.

[0294] Toxicant signatures at the proteome level are also useful for toxicological screening, and should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor correlation between transcript and protein abundances for some proteins in some tissues (Anderson, N. L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be useful in the analysis of compounds which do not significantly affect the transcript image, but which alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such cases.

[0295] In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins that are expressed in the treated biological sample are separated so that the amount of each protein can be quantified. The amount of each protein is compared to the amount of the corresponding protein in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample. Individual proteins are identified by sequencing the amino acid residues of the individual proteins and comparing these partial sequences to the polypeptides of the present invention.

[0296] In another embodiment, the toxicity of a test compound is assessed by treating a biological sample containing proteins with the test compound. Proteins from the biological sample are incubated with antibodies specific to the polypeptides of the present invention. The amount of protein recognized by the antibodies is quantified. The amount of protein in the treated biological sample is compared with the amount in an untreated biological sample. A difference in the amount of protein between the two samples is indicative of a toxic response to the test compound in the treated sample.

[0297] Microarrays may be prepared, used, and analyzed using methods known in the art. (See, e.g., Brennan, T. M. et al. (1995) U.S. Pat. No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al. (1995) PCT application WO95/35505; Heller, R. A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155; and Heller, M. J. et al. (1997) U.S. Pat. No. 5,605,662.) Various types of microarrays are well known and thoroughly described in DNA Microarrays: A Practical Approach, M. Schena, ed. (1999) Oxford University Press, London, hereby expressly incorporated by reference.

[0298] In another embodiment of the invention, nucleic acid sequences encoding TRICH may be used to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either coding or noncoding sequences may be used, and in some instances, noncoding sequences may be preferable over coding sequences. For example, conservation of a coding sequence among members of a multi-gene family may potentially cause undesired cross hybridization during chromosomal mapping. The sequences may be mapped to a particular chromosome, to a specific region of a chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs), yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1 constructions, or single chromosome cDNA libraries. (See, e.g., Harrington, J. J. et al. (1997) Nat. Genet. 15:345-355; Price, C. M. (1993) Blood Rev. 7:127-134; and Trask, B. J. (1991) Trends Genet. 7:149-154.) Once mapped, the nucleic acid sequences of the invention may be used to develop genetic linkage maps, for example, which correlate the inheritance of a disease state with the inheritance of a particular chromosome region or restriction fragment length polymorphism (RFLP). (See, for example, Lander, E. S. and D. Botstein (1986) Proc. Natl. Acad. Sci. USA 83:7353-7357.)

[0299] Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic map data. (See, e.g., Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968.) Examples of genetic map data can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM) World Wide Web site. Correlation between the location of the gene encoding TRICH on a physical map and a specific disorder, or a predisposition to a specific disorder, may help define the region of DNA associated with that disorder and thus may further positional cloning efforts.

[0300] In situ hybridization of chromosomal preparations and physical mapping techniques, such as linkage analysis using established chromosomal markers, may be used for extending genetic maps. Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may reveal associated markers even if the exact chromosomal locus is not known. This information is valuable to investigators searching for disease genes using positional cloning or other gene discovery techniques. Once the gene or genes responsible for a disease or syndrome have been crudely localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23, any sequences mapping to that area may represent associated or regulatory genes for further investigation. (See, e.g., Gatti, R. A. et al. (1988) Nature 336:577-580.) The nucleotide sequence of the instant invention may also be used to detect differences in the chromosomal location due to translocation, inversion, etc., among normal, carrier, or affected individuals.

[0301] In another embodiment of the invention, TRICH, its catalytic or immunogenic fragments, or oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug screening techniques. The fragment employed in such screening may be free in solution, affixed to a solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes between TRICH and the agent being tested may be measured.

[0302] Another technique for drug screening provides for high throughput screening of compounds having suitable binding affinity to the protein of interest. (See, e.g., Geysen, et al. (1984) PCT application WO84/03564.) In this method, large numbers of different small test compounds are synthesized on a solid substrate. The test compounds are reacted with TRICH, or fragments thereof, and washed. Bound TRICH is then detected by methods well known in the art. Purified TRICH can also be coated directly onto plates for use in the aforementioned drug screening techniques. Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

[0303] In another embodiment, one may use competitive drug screening assays in which neutralizing antibodies capable of binding TRICH specifically compete with a test compound for binding TRICH. In this manner, antibodies can be used to detect the presence of any peptide which shares one or more antigenic determinants with TRICH.

[0304] In additional embodiments, the nucleotide sequences which encode TRICH may be used in any molecular biology techniques that have yet to be developed, provided the new techniques rely on properties of nucleotide sequences that are currently known, including, but not limited to, such properties as the triplet genetic code and specific base pair interactions.

[0305] Without further elaboration, it is believed that one skilled in the art can, using the preceding description, utilize the present invention to its fullest extent. The following embodiments are, therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure in any way whatsoever.

[0306] The disclosures of all patents, applications and publications, mentioned above and below, in particular U.S. Ser. No. 60/267,892, U.S. Ser. No. 60/271,168, U.S. Ser. No. 60/272,890, U.S. Ser. No. 60/276,860, U.S. Ser. No. 60/278,255, U.S. Ser. No. 60/280,538 and U.S. Ser. No. [Attorney Docket No. PF-1366, filed Jan. 25, 2002] are expressly incorporated by reference herein.

EXAMPLES

[0307] I. Construction of cDNA Libraries

[0308] Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto Calif.). Some tissues were homogenized and lysed in guanidinium isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of denaturants, such as TRIZOL (Life Technologies), a monophasic solution of phenol and guanidine isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and ethanol, or by other routine methods.

[0309] Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was isolated using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN, Chatsworth Calif.), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA purification kit (Ambion, Austin Tex.).

[0310] In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP vector system (Stratagene) or SUPERSCRIPT plasmid system (Life Technologies), using the recommended procedures or similar methods known in the art. (See, e.g., Ausubel, 1997, supra, units 5.1-6.6.) Reverse transcription was initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000, SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Pharmacia Biotech) or preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction enzyme sites of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid (Life Technologies), PCDNA2.1 plasmid (Invitrogen, Carlsbad Calif.), PBK-CMV plasmid (Stratagene), PCR2-TOPOTA plasmid (Invitrogen), PCMV-ICIS plasmid (Stratagene), pIGEN (Incyte Genomics, Palo Alto Calif.), pRARE (Incyte Genomics), or pINCY (Incyte Genomics), or derivatives thereof. Recombinant plasmids were transformed into competent E. coli cells including XL1-Blue, XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Life Technologies.

[0311] II. Isolation of cDNA Clones

[0312] Plasmids obtained as described in Example I were recovered from host cells by in vivo excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg Md.); and QIAWELL 8 Plasmid, QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96 plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1 ml of distilled water and stored, with or without lyophilization, at 4° C.

[0313] Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a high-throughput format (Rao, V. B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal cycling steps were carried out in a single reaction mixture. Samples were processed and stored in 384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically using PICOGREEN dye (Molecular Probes, Eugene Oreg.) and a FLUOROSKAN II fluorescence scanner (Labsystems Oy, Helsinki, Finland).

[0314] III. Sequencing and Analysis

[0315] Incyte cDNAs recovered in plasmids as described in Example II were sequenced as follows. Sequencing reactions were processed using standard methods or high-throughput instrumentation such as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared using reagents provided by Amersham Pharmacia Biotech or supplied in ABI sequencing kits such as the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems). Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides were carried out using the MEGABACE 1000 DNA sequencing system (Molecular Dynamics); the ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI protocols and base calling software; or other sequence analysis systems known in the art. Reading frames within the cDNA sequences were identified using standard methods (reviewed in Ausubel, 1997, supra, unit 7.7). Some of the cDNA sequences were selected for extension using the techniques disclosed in Example VIII.

[0316] The polynucleotide sequences derived from Incyte cDNAs were validated by removing vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The Incyte cDNA sequences or translations thereof were then queried against a selection of public databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and BLOCKS, PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo sapiens, Rattus norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Candida albicans (Incyte Genomics, Palo Alto Calif.); hidden Markov model (HMM)-based protein family databases such as PFAM; and HMM-based protein domain databases such as SMART (Schultz et al. (1998) Proc. Natl. Acad. Sci. USA 95:5857-5864; Letunic, I. et al. (2002) Nucleic Acids Res. 30:242-244). (HMM is a probabilistic approach which analyzes consensus primary structures of gene families. See, for example, Eddy, S. R. (1996) Curr. Opin. Struct. Biol. 6:361-365.) The queries were performed using programs based on BLAST, FASTA, BLIMPS, and HMMER. The Incyte cDNA sequences were assembled to produce full length polynucleotide sequences. Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched sequences, or Genscan-predicted coding sequences (see Examples IV and V) were used to extend Incyte cDNA assemblages to full length. Assembly was performed using programs based on Phred, Phrap, and Consed, and cDNA assemblages were screened for open reading frames using programs based on GeneMark, BLAST, and FASTA. The full length polynucleotide sequences were translated to derive the corresponding full length polypeptide sequences. Alternatively, a polypeptide of the invention may begin at any of the methionine residues of the full length translated polypeptide. Full length polypeptide sequences were subsequently analyzed by querying against databases such as the GenBank protein databases (genpept), SwissProt, the PROTEOME databases, BLOCKS, PRINTS, DOMO, PRODOM, Prosite, hidden Markov model (HMM)-based protein family databases such as PFAM; and HMM-based protein domain databases such as SMART. Full length polynucleotide sequences are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering, South San Francisco Calif.) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide sequence alignments are generated using default parameters specified by the CLUSTAL algorithm as incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also calculates the percent identity between aligned sequences.

[0317] Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of Incyte cDNA and full length sequences and provides applicable descriptions, references, and threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used, the second column provides brief descriptions thereof, the third column presents appropriate references, all of which are incorporated by reference herein in their entirety, and the fourth column presents, where applicable, the scores, probability values, and other parameters used to evaluate the strength of a match between two sequences (the higher the score or the lower the probability value, the greater the identity between two sequences).

[0318] The programs described above for the assembly and analysis of full length polynucleotide and polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ ID NO:21-40. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization and amplification technologies are described in Table 4, column 2.

[0319] IV. Identification and Editing of Coding Sequences from Genomic DNA

[0320] Putative transporters and ion channels were initially identified by running the Genscan gene identification program against public genomic sequence databases (e.g., gbpri and gbhtg). Genscan is a general-purpose gene identification program which analyzes genomic DNA sequences from a variety of organisms (See Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94, and Burge, C. and S. Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to form an assembled cDNA sequence extending from a methionine to a stop codon. The output of Genscan is a FASTA database of polynucleotide and polypeptide sequences. The maximum range of sequence for Genscan to analyze at once was set to 30 kb. To determine which of these Genscan predicted cDNA sequences encode transporters and ion channels, the encoded polypeptides were analyzed by querying against PFAM models for transporters and ion channels. Potential transporters and ion channels were also identified by homology to Incyte cDNA sequences that had been annotated as transporters and ion channels. These selected Genscan-predicted sequences were then compared by BLAST analysis to the genpept and gbpri public databases. Where necessary, the Genscan-predicted sequences were then edited by comparison to the top BLAST hit from genpept to correct errors in the sequence predicted by Genscan, such as extra or omitted exons. BLAST analysis was also used to find any Incyte cDNA or public cDNA coverage of the Genscan-predicted sequences, thus providing evidence for transcription. When Incyte cDNA coverage was available, this information was used to correct or confirm the Genscan predicted sequence. Full length polynucleotide sequences were obtained by assembling Genscan-predicted coding sequences with Incyte cDNA sequences and/or public cDNA sequences using the assembly process described in Example III. Alternatively, full length polynucleotide sequences were derived entirely from edited or unedited Genscan-predicted coding sequences.

[0321] V. Assembly of Genomic Sequence Data with cDNA Sequence Data

[0322] “Stitched” Sequences

[0323] Partial cDNA sequences were extended with exons predicted by the Genscan gene identification program described in Example IV. Partial cDNAs assembled as described in Example III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm based on graph theory and dynamic programming to integrate cDNA and genomic information, generating possible splice variants that were subsequently confirmed, edited, or extended to create a full length sequence. Sequence intervals in which the entire length of the interval was present on more than one sequence in the cluster were identified, and intervals thus identified were considered to be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic sequences, then all three intervals were considered to be equivalent. This process allows unrelated but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals thus identified were then “stitched” together by the stitching algorithm in the order that they appear along their parent sequences to generate the longest possible sequence, as well as sequence variants. Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or genomic sequence to genomic sequence) were given preference over linkages which change parent type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended with additional cDNA sequences, or by inspection of genomic DNA, when necessary.

[0324] “Stretched” Sequences

[0325] Partial DNA sequences were extended to full length with an algorithm based on BLAST analysis. First, partial cDNAs assembled as described in Example III were queried against public databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST analysis to either Incyte cDNA sequences or Genscan exon predicted sequences described in Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs (HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions may occur in the chimeric protein with respect to the original GenBank protein homolog. The GenBank protein homolog, the chimeric protein, or both were used as probes to search for homologous genomic sequences from the public human genome databases. Partial DNA sequences were therefore “stretched” or extended by the addition of homologous genomic sequences. The resultant stretched sequences were examined to determine whether they contained a complete gene.

[0326] VI. Chromosomal Mapping of TRICH Encoding Polynucleotides

[0327] The sequences which were used to assemble SEQ ID NO:21-40 were compared with sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other implementations of the Smith-Waterman algorithm. Sequences from these databases that matched SEQ ID NO:21-40 were assembled into clusters of contiguous and overlapping sequences using assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment of all sequences of that cluster, including its particular SEQ ID NO:, to that map location.
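The cluster-based assignment of map locations described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function name, the dictionary layout, the sequence identifiers, and the choice to leave a cluster unassigned when its mapped members disagree are all hypothetical.

```python
def assign_map_locations(clusters, known_locations):
    """Propagate a known map location to every sequence in the same cluster.

    clusters: dict mapping a cluster id to a list of sequence ids (hypothetical layout).
    known_locations: dict mapping previously mapped sequence ids to a map location.
    Returns a dict mapping sequence ids to the location inherited from their cluster.
    """
    assigned = {}
    for members in clusters.values():
        # Collect the distinct locations of any previously mapped members.
        locations = {known_locations[m] for m in members if m in known_locations}
        # Assumption: only propagate when the mapped members agree on one location.
        if len(locations) == 1:
            location = next(iter(locations))
            for member in members:
                assigned[member] = location
    return assigned


# Hypothetical identifiers and interval, for illustration only.
example_clusters = {
    "cluster_1": ["SEQ_A", "est_001", "marker_X"],
    "cluster_2": ["SEQ_B", "est_002"],
}
example_known = {"marker_X": "chr1:100.2-105.8 cM"}
example_result = assign_map_locations(example_clusters, example_known)
```

In this sketch every member of `cluster_1` inherits the location of `marker_X`, while `cluster_2`, which contains no mapped sequence, remains unassigned.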

[0328] Map locations are represented by ranges, or intervals, of human chromosomes. The map position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances are based on genetic markers mapped by Généthon which provide boundaries for radiation hybrid markers whose sequences were included in each of the clusters. Human genome maps and other resources available to the public, such as the NCBI “GeneMap'99” World Wide Web site (http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified disease genes map within or in proximity to the intervals indicated above.

[0329] VII. Analysis of Polynucleotide Expression

[0330] Northern analysis is a laboratory technique used to detect the presence of a transcript of a gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs from a particular cell type or tissue have been bound. (See, e.g., Sambrook, supra, ch. 7; Ausubel (1995) supra, ch. 4 and 16.)

[0331] Analogous computer techniques applying BLAST were used to search for identical or related molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the computer search can be modified to determine whether any particular match is categorized as exact or similar. The basis of the search is the product score, which is defined as:

$$\text{Product Score} = \frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length}(\text{Seq. 1}),\ \text{length}(\text{Seq. 2})\}}$$

[0332] The product score takes into account both the degree of similarity between two sequences and the length of the sequence match. The product score is a normalized value between 0 and 100, and is calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair (HSP), and −4 for every mismatch. Two sequences may share more than one HSP (separated by gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate the product score. The product score represents a balance between fractional overlap and quality in a BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the entire length of the shorter of the two sequences being compared. A product score of 70 is produced either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% identity and 100% overlap.
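The product score calculation described above can be illustrated with a short Python sketch. The function names and the 200-base example length are assumptions introduced for illustration; only the +5/−4 scoring and the normalization by 5 times the shorter sequence length come from the description.

```python
def blast_score(matches: int, mismatches: int) -> int:
    """Score a single ungapped HSP: +5 per matching base, -4 per mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(score: float, percent_identity: float, len1: int, len2: int) -> float:
    """(BLAST score x percent identity) / (5 x length of the shorter sequence)."""
    return score * percent_identity / (5 * min(len1, len2))


# Hypothetical example: the shorter sequence is 200 bases long.
SHORT_LEN = 200

# 100% identity over the full length of the shorter sequence.
full_match = product_score(blast_score(200, 0), 100.0, SHORT_LEN, 500)

# 100% identity over 70% of the shorter sequence (140 matching bases).
partial_overlap = product_score(blast_score(140, 0), 100.0, SHORT_LEN, 500)

# 88% identity over the full length (176 matches, 24 mismatches).
lower_identity = product_score(blast_score(176, 24), 88.0, SHORT_LEN, 500)
```

Under these assumptions `full_match` is 100 and `partial_overlap` is 70, while `lower_identity` evaluates to about 69, in line with the approximate figure of 70 quoted above for 88% identity and 100% overlap.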

[0333] Alternatively, polynucleotide sequences encoding TRICH are analyzed with respect to the tissue sources from which they were derived. For example, some full length sequences are assembled, at least in part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is derived from a cDNA library constructed from a human tissue. Each human tissue is classified into one of the following organ/tissue categories: cardiovascular system; connective tissue; digestive system; embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. The number of libraries in each category is counted and divided by the total number of libraries across all categories. Similarly, each human tissue is classified into one of the following disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma, cardiovascular, pooled, and other, and the number of libraries in each category is counted and divided by the total number of libraries across all categories. The resulting percentages reflect the tissue- and disease-specific expression of cDNA encoding TRICH. cDNA sequences and cDNA library/tissue information are found in the LIFESEQ GOLD database (Incyte Genomics, Palo Alto Calif.).
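The per-category calculation described above (libraries in a category divided by libraries across all categories) can be sketched in Python. The function name and the example library list are hypothetical; the category labels are taken from the list above.

```python
from collections import Counter


def category_percentages(library_categories):
    """Return the percentage of libraries falling in each category.

    library_categories: one category label per cDNA library that
    contributed a sequence (hypothetical input layout).
    """
    counts = Counter(library_categories)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical example: four libraries contributing cDNAs to one sequence.
example_libraries = ["nervous system", "nervous system", "liver", "skin"]
example_percentages = category_percentages(example_libraries)
```

For this made-up input, half of the libraries fall in the nervous system category and a quarter each in liver and skin. The same function applies unchanged to the disease/condition categories.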

[0334] VIII. Extension of TRICH Encoding Polynucleotides

[0335] Full length polynucleotide sequences were also produced by extension of an appropriate fragment of the full length molecule using oligonucleotide primers designed from this fragment. One primer was synthesized to initiate 5′ extension of the known fragment, and the other primer was synthesized to initiate 3′ extension of the known fragment. The initial primers were designed using OLIGO 4.06 software (National Biosciences), or another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of about 50% or more, and to anneal to the target sequence at temperatures of about 68° C. to about 72° C. Any stretch of nucleotides which would result in hairpin structures and primer-primer dimerizations was avoided.
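Two of the primer criteria given above, length and GC content, are simple enough to check programmatically. The Python sketch below is illustrative only and is not the OLIGO 4.06 procedure: the function names are hypothetical, the "about" ranges are treated as hard cutoffs for simplicity, and annealing temperature, hairpins, and primer-primer dimerization are not evaluated.

```python
def gc_content(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    primer = primer.upper()
    return 100.0 * sum(base in "GC" for base in primer) / len(primer)


def passes_basic_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                          min_gc: float = 50.0) -> bool:
    """Check length (22 to 30 nt) and GC content (50% or more) only.

    Assumption: the approximate ranges in the text are applied as strict limits.
    """
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc


# Hypothetical 24-nucleotide candidate primer with 14 G/C bases.
candidate = "GCGCATATGCGCATGCGCATGCAT"
candidate_ok = passes_basic_criteria(candidate)
```

A primer passing this screen would still need to be checked for annealing temperature and secondary structure with a dedicated design program.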

[0336] Selected human cDNA libraries were used to extend the sequence.If more than one extension was necessary or desired, additional ornested sets of primers were designed.

[0337] High fidelity amplification was obtained by PCR using methods well known in the art. PCR was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg²⁺, (NH₄)₂SO₄, and 2-mercaptoethanol, Taq DNA polymerase (Amersham Pharmacia Biotech), ELONGASE enzyme (Life Technologies), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair PCI A and PCI B: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 60° C., 1 min; Step 4: 68° C., 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68° C., 5 min; Step 7: storage at 4° C. In the alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 57° C., 1 min; Step 4: 68° C., 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68° C., 5 min; Step 7: storage at 4° C.

[0338] The concentration of DNA in each well was determined by dispensing 100 μl PICOGREEN quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene Oreg.) dissolved in 1×TE and 0.5 μl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar, Acton Mass.), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II (Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the concentration of DNA. A 5 μl to 10 μl aliquot of the reaction mixture was analyzed by electrophoresis on a 1% agarose gel to determine which reactions were successful in extending the sequence.

[0339] The extended nucleotides were desalted and concentrated, transferred to 384-well plates, digested with CviJI chlorella virus endonuclease (Molecular Biology Research, Madison Wis.), and sonicated or sheared prior to religation into pUC 18 vector (Amersham Pharmacia Biotech). For shotgun sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%) agarose gels, fragments were excised, and agar digested with Agar ACE (Promega). Extended clones were religated using T4 ligase (New England Biolabs, Beverly Mass.) into pUC 18 vector (Amersham Pharmacia Biotech), treated with Pfu DNA polymerase (Stratagene) to fill in restriction site overhangs, and transfected into competent E. coli cells. Transformed cells were selected on antibiotic-containing media, and individual colonies were picked and cultured overnight at 37° C. in 384-well plates in LB/2× carb liquid media.

[0340] The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase (Amersham Pharmacia Biotech) and Pfu DNA polymerase (Stratagene) with the following parameters: Step 1: 94° C., 3 min; Step 2: 94° C., 15 sec; Step 3: 60° C., 1 min; Step 4: 72° C., 2 min; Step 5: steps 2, 3, and 4 repeated 29 times; Step 6: 72° C., 5 min; Step 7: storage at 4° C. DNA was quantified by PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries were reamplified using the same conditions as described above. Samples were diluted with 20% dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers and the DYENAMIC DIRECT kit (Amersham Pharmacia Biotech) or the ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).

[0341] In like manner, full length polynucleotide sequences are verified using the above procedure or are used to obtain 5′ regulatory sequences using the above procedure along with oligonucleotides designed for such extension, and an appropriate genomic library.

[0342] IX. Identification of Single Nucleotide Polymorphisms in TRICH Encoding Polynucleotides

[0343] Common DNA sequence variants known as single nucleotide polymorphisms (SNPs) were identified in SEQ ID NO:21-40 using the LIFESEQ database (Incyte Genomics). Sequences from the same gene were clustered together and assembled as described in Example III, allowing the identification of all sequence variants in the gene. An algorithm consisting of a series of filters was used to distinguish SNPs from other sequence variants. Preliminary filters removed the majority of basecall errors by requiring a minimum Phred quality score of 15, and removed sequence alignment errors and errors resulting from improper trimming of vector sequences, chimeras, and splice variants. An automated procedure of advanced chromosome analysis analyzed the original chromatogram files in the vicinity of the putative SNP. Clone error filters used statistically generated algorithms to identify errors introduced during laboratory processing, such as those caused by reverse transcriptase, polymerase, or somatic mutation. Clustering error filters used statistically generated algorithms to identify errors resulting from clustering of close homologs or pseudogenes, or due to contamination by non-human sequences. A final set of filters removed duplicates and SNPs found in immunoglobulins or T-cell receptors.
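The preliminary basecall filter described above, which requires a minimum Phred quality score of 15, can be sketched in Python. This illustrates only that single filter and is not the filtering algorithm actually used: the function names and the (base, quality) input layout are hypothetical. The conversion from quality score to error probability uses the standard Phred definition, Q = −10 log₁₀(P).

```python
def phred_error_probability(quality: float) -> float:
    """Estimated basecall error probability for a Phred quality score."""
    return 10 ** (-quality / 10)


def filter_basecalls(calls, min_quality: int = 15):
    """Keep only basecalls whose Phred quality meets the minimum.

    calls: list of (base, quality) tuples (hypothetical input layout).
    """
    return [(base, q) for base, q in calls if q >= min_quality]


# Hypothetical basecalls at one candidate SNP position across four reads.
example_calls = [("A", 32), ("G", 9), ("A", 15), ("G", 14)]
example_kept = filter_basecalls(example_calls)
```

Under the Phred definition, a quality score of 15 corresponds to an estimated error probability of roughly 3%, so this threshold discards calls less reliable than that before the later filters are applied.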

[0344] Certain SNPs were selected for further characterization by mass spectrometry using the high throughput MASSARRAY system (Sequenom, Inc.) to analyze allele frequencies at the SNP sites in four different human populations. The Caucasian population comprised 92 individuals (46 male, 46 female), including 83 from Utah, four French, three Venezuelan, and two Amish individuals. The African population comprised 194 individuals (97 male, 97 female), all African Americans. The Hispanic population comprised 324 individuals (162 male, 162 female), all Mexican Hispanic. The Asian population comprised 126 individuals (64 male, 62 female) with a reported parental breakdown of 43% Chinese, 31% Japanese, 13% Korean, 5% Vietnamese, and 8% other Asian. Allele frequencies were first analyzed in the Caucasian population; in some cases those SNPs which showed no allelic variance in this population were not further tested in the other three populations.
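As an illustration of what an allele frequency at a SNP site represents, the Python sketch below estimates it from genotype counts at a biallelic site by standard allele counting. The description above does not state how MASSARRAY measurements were converted to frequencies, so this method, the function name, and the genotype counts are all assumptions; only the population size of 92 is taken from the text.

```python
def allele_frequency(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Frequency of the alternate allele at a biallelic SNP.

    Each individual carries two alleles: homozygous-alternate individuals
    contribute two copies of the alternate allele, heterozygotes one.
    """
    n_individuals = n_hom_ref + n_het + n_hom_alt
    if n_individuals == 0:
        raise ValueError("no genotyped individuals")
    return (2 * n_hom_alt + n_het) / (2 * n_individuals)


# Hypothetical genotype counts for a population of 92 individuals.
example_frequency = allele_frequency(n_hom_ref=60, n_het=28, n_hom_alt=4)

# A SNP with no allelic variance in the population has frequency 0 (or 1).
monomorphic = allele_frequency(n_hom_ref=92, n_het=0, n_hom_alt=0)
```

With these made-up counts, 36 of the 184 alleles are the alternate allele. A site such as `monomorphic`, showing no variance in the first population tested, corresponds to the SNPs that in some cases were not tested further.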

[0345] X. Labeling and Use of Individual Hybridization Probes

[0346] Hybridization probes derived from SEQ ID NO:21-40 are employed to screen cDNAs, genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base pairs, is specifically described, essentially the same procedure is used with larger nucleotide fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06 software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 μCi of [γ-³²P] adenosine triphosphate (Amersham Pharmacia Biotech), and T4 polynucleotide kinase (DuPont NEN, Boston Mass.). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25 superfine size exclusion dextran bead column (Amersham Pharmacia Biotech). An aliquot containing 10⁷ counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I, Xba I, or Pvu II (DuPont NEN).

[0347] The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon membranes (Nytran Plus, Schleicher & Schuell, Durham N.H.). Hybridization is carried out for 16 hours at 40° C. To remove nonspecific signals, blots are sequentially washed at room temperature under conditions of up to, for example, 0.1× saline sodium citrate and 0.5% sodium dodecyl sulfate. Hybridization patterns are visualized using autoradiography or an alternative imaging means and compared.

[0348] XI. Microarrays

[0349] The linkage or synthesis of array elements upon a microarray can be achieved utilizing photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler, supra), mechanical microspotting technologies, and derivatives thereof. The substrate in each of the aforementioned technologies should be uniform and solid with a non-porous surface (Schena (1999), supra). Suggested substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a procedure analogous to a dot or slot blot may also be used to arrange and link elements to the surface of a substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may be produced using available methods and machines well known to those of ordinary skill in the art and may contain any appropriate number of elements. (See, e.g., Schena, M. et al. (1995) Science 270:467-470; Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998) Nat. Biotechnol. 16:27-31.)

[0350] Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be selected using software well known in the art such as LASERGENE software (DNASTAR). The array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection. After hybridization, nonhybridized nucleotides from the biological sample are removed, and a fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser desorption and mass spectrometry may be used for detection of hybridization. The degree of complementarity and the relative abundance of each polynucleotide which hybridizes to an element on the microarray may be assessed. In one embodiment, microarray preparation and usage is described in detail below.

[0351] Tissue or Cell Sample Preparation

[0352] Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and poly(A)⁺ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)⁺ RNA sample is reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/μl oligo-(dT) primer (21 mer), 1× first strand buffer, 0.03 units/μl RNase inhibitor, 500 μM dATP, 500 μM dGTP, 500 μM dTTP, 40 μM dCTP, 40 μM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Pharmacia Biotech). The reverse transcription reaction is performed in a 25 ml volume containing 200 ng poly(A)⁺ RNA with GEMBRIGHT kits (Incyte). Specific control poly(A)⁺ RNAs are synthesized by in vitro transcription from non-coding yeast genomic DNA. After incubation at 37° C. for 2 hr, each reaction sample (one with Cy3 and another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and incubated for 20 minutes at 85° C. to stop the reaction and degrade the RNA. Samples are purified using two successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc. (CLONTECH), Palo Alto Calif.) and after combining, both reaction samples are ethanol precipitated using 1 ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The sample is then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook N.Y.) and resuspended in 14 μl 5×SSC/0.2% SDS.

[0353] For SEQ ID NO:36, for example, HMECs, which are a primary human breast epithelial cell line isolated from a normal donor, were grown in Mammary Epithelial Cell Growth Medium (Clonetics, Walkersville Md.) supplemented with 10 ng/ml human recombinant epidermal growth factor, 5 mg/ml insulin, 0.5 mg/ml hydrocortisone, 50 mg/ml gentamicin, 50 ng/ml amphotericin-B, and 0.5 mg/ml bovine pituitary extract. Cells were grown to 70-80% confluence prior to harvesting. About 1×10⁷ cells were harvested at passage 8 (progenitor cells), passages 10 and 12 (progressively senescent cells), passage 14 (presenescent cells), and passage 15 (senescent cells). In this manner, it was demonstrated that the expression in senescent cells of component 2812176 of SEQ ID NO:36 is increased by a factor of at least 2.

[0354] Microarray Preparation

[0355] Sequences of the present invention are used to generate array elements. Each array element is amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses primers complementary to the vector sequences flanking the cDNA insert. Array elements are amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5 μg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Pharmacia Biotech).

[0356] Purified array elements are immobilized on polymer-coated glass slides. Glass microscope slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR Scientific Products Corporation (VWR), West Chester Pa.), washed extensively in distilled water, and coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110° C. oven.

[0357] Array elements are applied to the coated glass substrate using a procedure described in U.S. Pat. No. 5,807,522, incorporated herein by reference. 1 μl of the array element DNA, at an average concentration of 100 ng/μl, is loaded into the open capillary printing element by a high-speed robotic apparatus. The apparatus then deposits about 5 nl of array element sample per slide.

[0358] Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene). Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water. Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate buffered saline (PBS) (Tropix, Inc., Bedford Mass.) for 30 minutes at 60° C. followed by washes in 0.2% SDS and distilled water as before.

[0359] Hybridization

[0360] Hybridization reactions contain 9 μl of sample mixture consisting of 0.2 μg each of Cy3 and Cy5 labeled cDNA synthesis products in 5×SSC, 0.2% SDS hybridization buffer. The sample mixture is heated to 65° C. for 5 minutes and is aliquoted onto the microarray surface and covered with a 1.8 cm² coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of 140 μl of 5×SSC in a corner of the chamber. The chamber containing the arrays is incubated for about 6.5 hours at 60° C. The arrays are washed for 10 min at 45° C. in a first wash buffer (1×SSC, 0.1% SDS), three times for 10 minutes each at 45° C. in a second wash buffer (0.1×SSC), and dried.

[0361] Detection

[0362] Reporter-labeled hybridization complexes are detected with a microscope equipped with an Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara Calif.) capable of generating spectral lines at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is focused on the array using a 20× microscope objective (Nikon, Inc., Melville N.Y.). The slide containing the array is placed on a computer-controlled X-Y stage on the microscope and raster-scanned past the objective. The 1.8 cm×1.8 cm array used in the present example is scanned with a resolution of 20 micrometers.

[0363] In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially. Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477, Hamamatsu Photonics Systems, Bridgewater N.J.) corresponding to the two fluorophores. Appropriate filters positioned between the array and the photomultiplier tubes are used to filter the signals. The emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source, although the apparatus is capable of recording the spectra from both fluorophores simultaneously.

[0364] The sensitivity of the scans is typically calibrated using the signal intensity generated by a cDNA control species added to the sample mixture at a known concentration. A specific location on the array contains a complementary DNA sequence, allowing the intensity of the signal at that location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples from different sources (e.g., representing test and control cells), each labeled with a different fluorophore, are hybridized to a single array for the purpose of identifying genes that are differentially expressed, the calibration is done by labeling samples of the calibrating cDNA with the two fluorophores and adding identical amounts of each to the hybridization mixture.
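The two-fluorophore calibration described above can be illustrated with a minimal Python sketch. It assumes that the calibrating cDNA, added in identical amounts in both labels, should give equal Cy3 and Cy5 signals, so that a single scale factor balances the channels. The function names and the numerical readings are hypothetical and are not taken from the procedure.

```python
def balance_factor(control_cy3, control_cy5):
    """Scale factor that makes the Cy5 calibration signal equal to the Cy3 one."""
    if control_cy5 <= 0:
        raise ValueError("control Cy5 signal must be positive")
    return control_cy3 / control_cy5


def balanced_ratio(cy3, cy5, factor):
    """Cy5/Cy3 ratio for one array element after balancing the channels."""
    if cy3 <= 0:
        raise ValueError("Cy3 signal must be positive")
    return (cy5 * factor) / cy3


# Illustrative readings only: calibration spot, then one array element.
factor = balance_factor(control_cy3=1200.0, control_cy5=1000.0)
ratio = balanced_ratio(cy3=500.0, cy5=1000.0, factor=factor)
print(factor, ratio)
```

A balanced ratio of at least 2 for an element would correspond to the kind of two-fold difference in expression mentioned for the senescent cell example.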

[0365] The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital (A/D) conversion board (Analog Devices, Inc., Norwood Mass.) installed in an IBM-compatible PC computer. The digitized data are displayed as an image where the signal intensity is mapped using a linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping emission spectra) between the fluorophores using each fluorophore's emission spectrum.
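As a hedged illustration of the two operations described above, the following Python sketch maps a 12-bit reading linearly onto 20 color bins and removes crosstalk between two channels. The crosstalk model is an assumption: it treats each measured channel as the true signal plus a fixed fraction of the other fluorophore's signal, with the fractions (here called a35 and a53, hypothetical names) taken to be known from the emission spectra.

```python
ADC_MAX = 4095  # full scale of a 12-bit converter


def color_index(value, n_colors=20, full_scale=ADC_MAX):
    """Map a digitized intensity linearly onto bins 0 (blue) to n_colors-1 (red)."""
    value = max(0, min(value, full_scale))
    return min(value * n_colors // (full_scale + 1), n_colors - 1)


def correct_crosstalk(m3, m5, a35, a53):
    """Recover true signals s3, s5 from measured m3, m5.

    Assumed model: m3 = s3 + a53 * s5 and m5 = s5 + a35 * s3, where
    a35 is the fraction of Cy3 emission seen in the Cy5 channel and
    a53 is the fraction of Cy5 emission seen in the Cy3 channel.
    """
    det = 1.0 - a35 * a53
    s3 = (m3 - a53 * m5) / det
    s5 = (m5 - a35 * m3) / det
    return s3, s5


print(color_index(0), color_index(2048), color_index(4095))
print(correct_crosstalk(110.0, 210.0, a35=0.1, a53=0.05))
```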

[0366] A grid is superimposed over the fluorescence signal image such that the signal from each spot is centered in each element of the grid. The fluorescence signal within each element is then integrated to obtain a numerical value corresponding to the average intensity of the signal. The software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
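The grid-based integration can be sketched in a few lines of Python. This is not the GEMTOOLS program; it is a simplified illustration that assumes a regular grid whose cells divide the image evenly, with the image held as a list of pixel rows. The function name and the toy image are hypothetical.

```python
def element_means(image, rows, cols):
    """Average pixel intensity in each cell of a rows x cols grid laid over image."""
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // rows, width // cols
    means = []
    for r in range(rows):
        row_means = []
        for c in range(cols):
            # Collect every pixel that falls inside grid element (r, c).
            pixels = [image[y][x]
                      for y in range(r * cell_h, (r + 1) * cell_h)
                      for x in range(c * cell_w, (c + 1) * cell_w)]
            row_means.append(sum(pixels) / len(pixels))
        means.append(row_means)
    return means


# Toy 4 x 4 image covering a 2 x 2 grid of array elements.
image = [
    [0, 0, 4, 4],
    [0, 8, 4, 4],
    [1, 1, 0, 0],
    [1, 1, 0, 12],
]
means = element_means(image, rows=2, cols=2)
print(means)
```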

[0367] XII. Complementary Polynucleotides

[0368] Sequences complementary to the TRICH-encoding sequences, or any parts thereof, are used to detect, decrease, or inhibit expression of naturally occurring TRICH. Although use of oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same procedure is used with smaller or with larger sequence fragments. Appropriate oligonucleotides are designed using OLIGO 4.06 software (National Biosciences) and the coding sequence of TRICH. To inhibit transcription, a complementary oligonucleotide is designed from the most unique 5′ sequence and used to prevent promoter binding to the coding sequence. To inhibit translation, a complementary oligonucleotide is designed to prevent ribosomal binding to the TRICH-encoding transcript.

[0369] XIII. Expression of TRICH

[0370] Expression and purification of TRICH is achieved using bacterial or virus-based expression systems. For expression of TRICH in bacteria, cDNA is subcloned into an appropriate vector containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3). Antibiotic resistant bacteria express TRICH upon induction with isopropyl beta-D-thiogalactopyranoside (IPTG). Expression of TRICH in eukaryotic cells is achieved by infecting insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus (AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is replaced with cDNA encoding TRICH by either homologous recombination or bacterial-mediated transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases. Infection of the latter requires additional genetic modifications to baculovirus. (See Engelhard, E. K. et al. (1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945.)

[0371] In most expression systems, TRICH is synthesized as a fusion protein with, e.g., glutathione S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step, affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a 26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham Pharmacia Biotech). Following purification, the GST moiety can be proteolytically cleaved from TRICH at specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak). 6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins (QIAGEN). Methods for protein expression and purification are discussed in Ausubel (1995, supra, ch. 10 and 16). Purified TRICH obtained by these methods can be used directly in the assays shown in Examples XVII, XVIII, and XIX, where applicable.

[0372] XIV. Functional Assays

[0373] TRICH function is assessed by expressing the sequences encoding TRICH at physiologically elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice include PCMV SPORT (Life Technologies) and PCR3.1 (Invitrogen, Carlsbad Calif.), both of which contain the cytomegalovirus promoter. 5-10 μg of recombinant vector are transiently transfected into a human cell line, for example, an endothelial or hematopoietic cell line, using either liposome formulations or electroporation. 1-2 μg of an additional plasmid containing sequences encoding a marker protein are co-transfected. Expression of a marker protein provides a means to distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the uptake of fluorescent molecules that diagnose events preceding or coincident with cell death. These events include changes in nuclear DNA content as measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994) Flow Cytometry, Oxford, New York N.Y.

[0374] The influence of TRICH on gene expression can be assessed using highly purified populations of cells transfected with sequences encoding TRICH and either CD64 or CD64-GFP. CD64 and CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success N.Y.). mRNA can be purified from the cells using methods well known by those of skill in the art. Expression of mRNA encoding TRICH and other genes of interest can be analyzed by northern analysis or microarray techniques.

[0375] XV. Production of TRICH Specific Antibodies

[0376] TRICH substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g., Harrington, M. G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to immunize animals (e.g., rabbits, mice, etc.) and to produce antibodies using standard protocols.

[0377] Alternatively, the TRICH amino acid sequence is analyzed using LASERGENE software (DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is synthesized and used to raise antibodies by means known to those of skill in the art. Methods for selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well described in the art. (See, e.g., Ausubel, 1995, supra, ch. 11.)

[0378] Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH (Sigma-Aldrich, St. Louis Mo.) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to increase immunogenicity. (See, e.g., Ausubel, 1995, supra.) Rabbits are immunized with the oligopeptide-KLH complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide and anti-TRICH activity by, for example, binding the peptide or TRICH to a substrate, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG.

[0379] XVI. Purification of Naturally Occurring TRICH Using Specific Antibodies

[0380] Naturally occurring or recombinant TRICH is substantially purified by immunoaffinity chromatography using antibodies specific for TRICH. An immunoaffinity column is constructed by covalently coupling anti-TRICH antibody to an activated chromatographic resin, such as CNBr-activated SEPHAROSE (Amersham Pharmacia Biotech). After the coupling, the resin is blocked and washed according to the manufacturer's instructions.

[0381] Media containing TRICH are passed over the immunoaffinity column, and the column is washed under conditions that allow the preferential absorbance of TRICH (e.g., high ionic strength buffers in the presence of detergent). The column is eluted under conditions that disrupt antibody/TRICH binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such as urea or thiocyanate ion), and TRICH is collected.

[0382] XVII. Identification of Molecules Which Interact with TRICH

[0383] TRICH, or biologically active fragments thereof, are labeled with ¹²⁵I Bolton-Hunter reagent. (See, e.g., Bolton, A. E. and W. M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules previously arrayed in the wells of a multi-well plate are incubated with the labeled TRICH, washed, and any wells with labeled TRICH complex are assayed. Data obtained using different concentrations of TRICH are used to calculate values for the number, affinity, and association of TRICH with the candidate molecules.

[0384] Alternatively, molecules interacting with TRICH are analyzed using the yeast two-hybrid system as described in Fields, S. and O. Song (1989) Nature 340:245-246, or using commercially available kits based on the two-hybrid system, such as the MATCHMAKER system (Clontech).

[0385] TRICH may also be used in the PATHCALLING process (CuraGen Corp., New Haven Conn.) which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. Pat. No. 6,057,101).

[0386] XVII. Identification of Molecules Which Interact with TRICH

[0387] Molecules which interact with TRICH may include transporter substrates, agonists or antagonists, modulatory proteins such as Gβγ proteins (Reimann, supra) or proteins involved in TRICH localization or clustering such as MAGUKs (Craven, supra). TRICH, or biologically active fragments thereof, are labeled with ¹²⁵I Bolton-Hunter reagent. (See, e.g., Bolton, A. E. and W. M. Hunter (1973) Biochem. J. 133:529-539.) Candidate molecules previously arrayed in the wells of a multi-well plate are incubated with the labeled TRICH, washed, and any wells with labeled TRICH complex are assayed. Data obtained using different concentrations of TRICH are used to calculate values for the number, affinity, and association of TRICH with the candidate molecules.
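The text does not specify how the number and affinity of binding sites are calculated from the concentration series. One conventional choice, offered here only as an assumption, is a Scatchard analysis for a single class of sites, in which bound/free is plotted against bound; the slope is −1/Kd and the intercept on the bound axis is Bmax. The Python sketch below fits that line by least squares to synthetic data; the function name and all values are hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit bound/free = (Bmax - bound) / Kd by least squares; return (Kd, Bmax)."""
    ys = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ys))
    slope = sxy / sxx                    # equals -1 / Kd
    intercept = mean_y - slope * mean_x  # equals Bmax / Kd
    return -1.0 / slope, -intercept / slope


# Synthetic single-site data generated with Kd = 2 and Bmax = 10 (arbitrary units).
free = [1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]
kd, bmax = scatchard_fit(bound, free)
print(kd, bmax)
```

With real counts, a nonlinear fit of the binding isotherm is often preferred to the linearized form, which distorts measurement error.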

[0388] Alternatively, proteins that interact with TRICH are isolated using the yeast 2-hybrid system (Fields, S. and O. Song (1989) Nature 340:245-246). TRICH, or fragments thereof, are expressed as fusion proteins with the DNA binding domain of Gal4 or lexA, and potential interacting proteins are expressed as fusion proteins with an activation domain. Interactions between the TRICH fusion protein and the TRICH interacting proteins (fusion proteins with an activation domain) reconstitute a transactivation function that is observed by expression of a reporter gene. Yeast 2-hybrid systems are commercially available, and methods for use of the yeast 2-hybrid system with ion channel proteins are discussed in Niethammer, M. and M. Sheng (1998, Meth. Enzymol. 293:104-122).

[0389] TRICH may also be used in the PATHCALLING process (CuraGen Corp., New Haven Conn.) which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S. Pat. No. 6,057,101).

[0390] Potential TRICH agonists or antagonists may be tested for activation or inhibition of TRICH ion channel activity using the assays described in section XVIII.

[0391] XVIII. Demonstration of TRICH Activity

[0392] Ion channel activity of TRICH is demonstrated using an electrophysiological assay for ion conductance. TRICH can be expressed by transforming a mammalian cell line such as COS7, HeLa or CHO with a eukaryotic expression vector encoding TRICH. Eukaryotic expression vectors are commercially available, and the techniques to introduce them into cells are well known to those skilled in the art. A second plasmid which expresses any one of a number of marker genes, such as β-galactosidase, is co-transformed into the cells to allow rapid identification of those cells which have taken up and expressed the foreign DNA. The cells are incubated for 48-72 hours after transformation under conditions appropriate for the cell line to allow expression and accumulation of TRICH and β-galactosidase.

[0393] Transformed cells expressing β-galactosidase are stained blue when a suitable colorimetric substrate is added to the culture media under conditions that are well known in the art. Stained cells are tested for differences in membrane conductance by electrophysiological techniques that are well known in the art. Untransformed cells, and/or cells transformed with either vector sequences alone or β-galactosidase sequences alone, are used as controls and tested in parallel. Cells expressing TRICH will have higher anion or cation conductance relative to control cells. The contribution of TRICH to conductance can be confirmed by incubating the cells using antibodies specific for TRICH. The antibodies will bind to the extracellular side of TRICH, thereby blocking the pore in the ion channel, and the associated conductance.

[0394] Alternatively, ion channel activity of TRICH is measured as current flow across a TRICH-containing Xenopus laevis oocyte membrane using the two-electrode voltage-clamp technique (Ishi et al., supra; Jegla, T. and L. Salkoff (1997) J. Neurosci. 17:32-44). TRICH is subcloned into an appropriate Xenopus oocyte expression vector, such as pBF, and 0.5-5 ng of mRNA is injected into mature stage IV oocytes. Injected oocytes are incubated at 18° C. for 1-5 days. Inside-out macropatches are excised into an intracellular solution containing 116 mM K-gluconate, 4 mM KCl, and 10 mM Hepes (pH 7.2). The intracellular solution is supplemented with varying concentrations of the TRICH mediator, such as cAMP, cGMP, or Ca²⁺ (in the form of CaCl₂), where appropriate. Electrode resistance is set at 2-5 MΩ and electrodes are filled with the intracellular solution lacking mediator. Experiments are performed at room temperature from a holding potential of 0 mV. Voltage ramps (2.5 s) from −100 to 100 mV are acquired at a sampling frequency of 500 Hz. Current measured is proportional to the activity of TRICH in the assay.
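The ramp parameters given above imply 1,250 samples per ramp (2.5 s at 500 Hz). The Python sketch below generates such a ramp and, as one possible summary of the recorded current, estimates a slope conductance from the current-voltage relation by least squares. The slope-conductance step is an assumption, since the text states only that current is proportional to activity; the function names and the ohmic test currents are hypothetical.

```python
def voltage_ramp(v_start_mv=-100.0, v_end_mv=100.0, duration_s=2.5, rate_hz=500.0):
    """Command voltages (mV) for a linear ramp sampled at rate_hz, endpoints included."""
    n = int(round(duration_s * rate_hz))
    step = (v_end_mv - v_start_mv) / (n - 1)
    return [v_start_mv + i * step for i in range(n)]


def slope_conductance_ns(voltages_mv, currents_pa):
    """Least-squares slope of current versus voltage; pA per mV equals nS."""
    n = len(voltages_mv)
    mean_v = sum(voltages_mv) / n
    mean_i = sum(currents_pa) / n
    sxy = sum((v - mean_v) * (i - mean_i) for v, i in zip(voltages_mv, currents_pa))
    sxx = sum((v - mean_v) ** 2 for v in voltages_mv)
    return sxy / sxx


ramp = voltage_ramp()
currents = [2.0 * v for v in ramp]  # synthetic ohmic response of 2 nS
print(len(ramp), slope_conductance_ns(ramp, currents))
```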

[0395] Transport activity of TRICH is assayed by measuring uptake of labeled substrates into Xenopus laevis oocytes. Oocytes at stages V and VI are injected with TRICH mRNA (10 ng per oocyte) and incubated for 3 days at 18° C. in OR2 medium (82.5 mM NaCl, 2.5 mM KCl, 1 mM CaCl₂, 1 mM MgCl₂, 1 mM Na₂HPO₄, 5 mM Hepes, 3.8 mM NaOH, 50 μg/ml gentamycin, pH 7.8) to allow expression of TRICH. Oocytes are then transferred to standard uptake medium (100 mM NaCl, 2 mM KCl, 1 mM CaCl₂, 1 mM MgCl₂, 10 mM Hepes/Tris pH 7.5). Uptake of various substrates (e.g., amino acids, sugars, drugs, ions, and neurotransmitters) is initiated by adding labeled substrate (e.g., radiolabeled with ³H, fluorescently labeled with rhodamine, etc.) to the oocytes. After incubating for 30 minutes, uptake is terminated by washing the oocytes three times in Na⁺-free medium, measuring the incorporated label, and comparing with controls. TRICH activity is proportional to the level of internalized labeled substrate. In particular, test substrates include glucose and other sugars for TRICH-1, aminophospholipids for TRICH-2, HCO₃⁻ for TRICH-3, sulfate and other anions for TRICH-4, nucleotides for TRICH-5, Na⁺ and bile acids for TRICH-6 and TRICH-8, cationic amino acids for TRICH-11, amino acids for TRICH-7, protons for TRICH-9, drugs for TRICH-12, bile acids for TRICH-13 and TRICH-17, nucleosides for TRICH-15, drugs and other xenobiotics for TRICH-16, and neurotransmitters or organic osmolytes for TRICH-18.

[0396] ATPase activity associated with TRICH can be measured by hydrolysis of radiolabeled ATP-[γ-³²P], separation of the hydrolysis products by chromatographic methods, and quantitation of the recovered ³²P using a scintillation counter. The reaction mixture contains ATP-[γ-³²P] and varying amounts of TRICH in a suitable buffer incubated at 37° C. for a suitable period of time. The reaction is terminated by acid precipitation with trichloroacetic acid and then neutralized with base, and an aliquot of the reaction mixture is subjected to membrane or filter paper-based chromatography to separate the reaction products. The amount of ³²P liberated is counted in a scintillation counter. The amount of radioactivity recovered is proportional to the ATPase activity of TRICH in the assay.

[0397] Lipocalin activity of TRICH is measured by ligand fluorescence enhancement spectrofluorometry (Lin et al. (1997) Molecular Vision 3:17). Examples of ligands include retinol (Sigma, St. Louis Mo.) and 16-anthryloxy-palmitic acid (16-AP) (Molecular Probes Inc., Eugene Oreg.). Ligand is dissolved in 100% ethanol and its concentration is estimated using known extinction coefficients (retinol: 46,000 A/M/cm at 325 nm; 16-AP: 8,200 A/M/cm at 361 nm). A 700 μl aliquot of 1 μM TRICH in 10 mM Tris (pH 7.5), 2 mM EDTA, and 500 mM NaCl is placed in a 1 cm path length quartz cuvette and 1 μl aliquots of ligand solution are added. Fluorescence is measured 100 seconds after each addition until readings are stable. Change in fluorescence per unit change in ligand concentration is proportional to TRICH activity.
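Estimating ligand concentration from an extinction coefficient follows the Beer-Lambert relation, c = A/(ε·l). The Python sketch below applies it with the two coefficients given above and then computes the ligand concentration in the cuvette after a number of 1 μl additions to the 700 μl sample. The 1 cm path length for the absorbance reading, the function names, and the example absorbance are assumptions made for illustration.

```python
# Extinction coefficients from the text, in absorbance units per M per cm.
EXTINCTION = {"retinol": 46000.0, "16-AP": 8200.0}


def stock_concentration_molar(absorbance, ligand, path_cm=1.0, dilution=1.0):
    """Beer-Lambert estimate of the ligand stock concentration (M)."""
    return absorbance * dilution / (EXTINCTION[ligand] * path_cm)


def cuvette_concentration_molar(stock_molar, n_aliquots, aliquot_ul=1.0, start_ul=700.0):
    """Ligand concentration (M) in the cuvette after n_aliquots additions."""
    added_ul = n_aliquots * aliquot_ul
    return stock_molar * added_ul / (start_ul + added_ul)


# Hypothetical reading: absorbance 0.46 at 325 nm for a retinol solution.
stock = stock_concentration_molar(0.46, "retinol")
print(stock, cuvette_concentration_molar(stock, n_aliquots=5))
```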

[0398] In particular, the activity of TRICH-10 is measured as Ca²⁺ conductance, the activity of TRICH-14 is measured as K⁺ conductance, and the activity of TRICH-19 is measured as calcium-activated K⁺ conductance.

[0399] XIX. Identification of TRICH Agonists and Antagonists

[0400] TRICH is expressed in a eukaryotic cell line such as CHO (Chinese Hamster Ovary) or HEK (Human Embryonic Kidney) 293. Ion channel activity of the transformed cells is measured in the presence and absence of candidate agonists or antagonists. Ion channel activity is assayed using patch clamp methods well known in the art or as described in Example XVIII. Alternatively, ion channel activity is assayed using fluorescent techniques that measure ion flux across the cell membrane (Velicelebi, G. et al. (1999) Meth. Enzymol. 294:20-47; West, M. R. and C. R. Molloy (1996) Anal. Biochem. 241:51-58). These assays may be adapted for high-throughput screening using microplates. Changes in internal ion concentration are measured using fluorescent dyes such as the Ca²⁺ indicator Fluo-4 AM, sodium-sensitive dyes such as SBFI and sodium green, or the Cl⁻ indicator MQAE (all available from Molecular Probes) in combination with the FLIPR fluorimetric plate reading system (Molecular Devices). In a more generic version of this assay, changes in membrane potential caused by ionic flux across the plasma membrane are measured using oxonol dyes such as DiBAC₄ (Molecular Probes). DiBAC₄ equilibrates between the extracellular solution and cellular sites according to the cellular membrane potential. The dye's fluorescence intensity is 20-fold greater when bound to hydrophobic intracellular sites, allowing detection of DiBAC₄ entry into the cell (Gonzalez, J. E. and P. A. Negulescu (1998) Curr. Opin. Biotechnol. 9:624-631). Candidate agonists or antagonists may be selected from known ion channel agonists or antagonists, peptide libraries, or combinatorial chemical libraries.
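For a microplate screen of the kind described above, each well's response is commonly expressed as the fractional change in fluorescence, ΔF/F₀, between a baseline reading and a reading taken after the candidate compound is added. The Python sketch below flags wells whose response exceeds a cutoff in either direction. The cutoff of 0.2, the well labels, and the readings are hypothetical; this is not a description of the FLIPR software.

```python
def fractional_response(baseline, after):
    """Fractional fluorescence change (F - F0) / F0 for one well."""
    if baseline <= 0:
        raise ValueError("baseline fluorescence must be positive")
    return (after - baseline) / baseline


def screen_plate(readings, threshold=0.2):
    """Return {well: response} for wells with |response| >= threshold.

    readings maps a well label to (baseline, after-compound) fluorescence.
    """
    hits = {}
    for well, (baseline, after) in readings.items():
        response = fractional_response(baseline, after)
        if abs(response) >= threshold:
            hits[well] = response
    return hits


# Illustrative readings in arbitrary fluorescence units.
readings = {"A1": (1000.0, 1500.0), "A2": (1000.0, 1050.0), "A3": (1000.0, 700.0)}
hits = screen_plate(readings)
print(hits)
```

Whether a rise or a fall in signal indicates channel activation depends on the dye and the ion carried, so the sign is kept rather than discarded.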

[0401] Various modifications and variations of the described methods and systems of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with certain embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in molecular biology or related fields are intended to be within the scope of the following claims.

TABLE 1

Incyte       Polypeptide   Incyte           Polynucleotide   Incyte
Project ID   SEQ ID NO:    Polypeptide ID   SEQ ID NO:       Polynucleotide ID
6911460      1             6911460CD1       21               6911460CB1
55138203     2             55138203CD1      22               55138203CB1
7478871      3             7478871CD1       23               7478871CB1
7483601      4             7483601CD1       24               7483601CB1
7487851      5             7487851CD1       25               7487851CB1
7472881      6             7472881CD1       26               7472881CB1
7612560      7             7612560CD1       27               7612560CB1
2880370      8             2880370CD1       28               2880370CB1
6267489      9             6267489CD1       29               6267489CB1
7484777      10            7484777CD1       30               7484777CB1
2493969      11            2493969CD1       31               2493969CB1
3244593      12            3244593CD1       32               3244593CB1
4921451      13            4921451CD1       33               4921451CB1
5547443      14            5547443CD1       34               5547443CB1
56008413     15            56008413CD1      35               56008413CB1
6127911      16            6127911CD1       36               6127911CB1
6427133      17            6427133CD1       37               6427133CB1
7472932      18            7472932CD1       38               7472932CB1
8463147      19            8463147CD1       39               8463147CB1
7506408      20            7506408CD1       40               7506408CB1

[0402] TABLE 2

Polypeptide   Incyte           GenBank ID NO: or   Probability
SEQ ID NO:    Polypeptide ID   PROTEOME ID NO:     Score         Annotation
1             6911460CD1       g145321             2.30E−65      [Escherichia coli] arabinose-proton symporter. Maiden, M. C. J. et al. (1988) J. Biol. Chem. 263: 8003-8010
2             55138203CD1      g4972583            0             [Homo sapiens] ATPase II. Mouro, I. et al. (1999) Biochem. Biophys. Res. Commun. 257: 333-339
3             7478871CD1       g11611537           0             [Oryctolagus cuniculus] anion exchanger 4a. Tsuganezawa, H. et al. (2000) J. Biol. Chem. 276: 8180-8189
4             7483601CD1       g8050590            6.30E−258     [Meriones unguiculatus] prestin. Zheng, J. et al. (2000) Nature 405: 149-155
5             7487851CD1       g1002424            2.40E−249     [Mus musculus] YSPL-1 (yolk sac permease-like molecule 1) form 1. Guimaraes, M. J. et al. (1995) Development 121: 3335-3346
6             7472881CD1       g455033             3.70E−88      [Cricetulus griseus] Na+ dependent ileal bile acid transporter. Wong, M. H. et al. (1994) J. Biol. Chem. 269: 1340-1347
7             7612560CD1       g14571904           0             [Rattus norvegicus] lysosomal amino acid transporter 1. Sagne, C. et al. (2001) Proc. Natl. Acad. Sci. U.S.A. 98: 7206-7211
8             2880370CD1       g455033             3.10E−36      [Cricetulus griseus] Na+ dependent ileal bile acid transporter. Wong, M. H. et al. (1994) supra
9             6267489CD1       g1226235            3.20E−130     [Mus musculus] Ac39/physophilin. Carrion-Vazquez, M. et al. (1998) Eur. J. Neurosci. 10: 1153-66
10            7484777CD1       g3243075            0             [Homo sapiens] melastatin 1. Hunter, J. J. et al. (1998) Genomics 54: 116-123; Duncan, L. M. et al. (2001) J. Clin. Oncol. 19: 568-576
11            2493969CD1       g1589917            3.20E−137     [Rattus norvegicus] cationic amino acid transporter-1. Aulak, K. S. et al. (1996) J. Biol. Chem. 271: 29799-29806
12            3244593CD1       g6682827            3.50E−236     [Rattus norvegicus] multidrug resistance protein (MRP5)
13            4921451CD1       g3628757            2.70E−257     [Homo sapiens] FIC1. Bull, L. N. et al. (1998) Cholestasis. Nat. Genet. 18: 219-224

[0403] TABLE 2

Polypeptide   Incyte           GenBank ID NO: or   Probability
SEQ ID NO:    Polypeptide ID   PROTEOME ID NO:     Score         Annotation
15            56008413CD1      g8698687            5.10E−29      [Mus musculus] equilibrative nitrobenzylthioinosine-insensitive nucleoside transporter ENT2. Kiss, A. et al. (2000) Biochem. J. 352: 363-372
16            6127911CD1       g17223626           0             [Homo sapiens] ATP-binding cassette A10
17            6427133CD1       g3628757            0             [Homo sapiens] FIC1. Bull, L. N. et al. (1998) Cholestasis. Nat. Genet. 18: 219-224
18            7472932CD1       g531469             1.20E−260     [Rattus norvegicus] renal osmotic stress-induced Na-Cl organic solute cotransporter. Wasserman, J. C. et al. (1994) Am. J. Physiol. 267: F688-94
19            8463147CD1       g3978472            0             [Rattus norvegicus] potassium channel subunit. Joiner, W. J. et al. (1998) Nat. Neurosci. 1: 462-469
20            7506408CD1       g3955100            9.40E−71      [Mus musculus] vacuolar adenosine triphosphatase subunit D
                               586887|Atp6d        7.90E−72      [Mus musculus] [Regulatory subunit; Active transporter, primary; Hydrolase; Transporter; ATPase] [Plasma membrane] Vacuolar H+-ATPase proton pump subunit D
                               340040|ATP6D        7.10E−71      [Homo sapiens] [Regulatory subunit; Active transporter, primary; Hydrolase; Transporter; ATPase] [Plasma membrane] Vacuolar H+-ATPase proton pump (subunit D), an accessory subunit in the peripheral catalytic V1 complex, may be involved in coupling ATP hydrolysis (V1 complex) and proton transport (V0 complex). Agarwal, A. K. and White, P. C. (2000) Biochem. Biophys. Res. Commun. 279: 543-547

[0404] TABLE 3
SEQ ID NO: | Incyte Polypeptide ID | Amino Acid Residues | Potential Phosphorylation Sites | Potential Glycosylation Sites | Signature Sequences, Domains and Motifs | Analytical Methods and Databases
1 | 6911460CD1 | 617 | S75 S169 S220 S256 S264 S385 S443 T18 T246 T403 T520 | N371 N383 N396 N401 | Sugar (and other) transporter domain: S43-L564 | HMMER_PFAM
 | | | | | Transmembrane Domains: E80-R106, A109-S129, I134-Y154, V168-A188, H194-M214, N274-Y300, A316-D339, G342-M370, A458-L485, G509-M537; N-terminus is non-cytosolic | TMAP
 | | | | | Sugar transport proteins BL00216: G51-S62, L133-A182 | BLIMPS_BLOCKS
 | | | | | Sugar transport proteins signatures: L119-I184 | PROFILESCAN
 | | | | | Sugar transporter signature PR00171: G51-I61, I134-V153, L465-V486, S488-M500 | BLIMPS_PRINTS
 | | | | | Glucose transporter signature PR00172: I279-Y300, S317-V338, L524-F544, L465-S488, R498-L516, W529-I549 | BLIMPS_PRINTS
 | | | | | SUGAR TRANSPORT PROTEINS DM00135|P09830|101-452: L119-G362 | BLAST_DOMO
 | | | | | Sugar transport proteins signature 1: G97-S113 | MOTIFS
2 | 55138203CD1 | 1193 | S32 S45 S54 S58 S202 S215 S245 S317 S353 S437 S472 S491 S534 S580 S586 S593 S644 S727 S796 S848 S943 S1131 S1167 S1175 T14 T85 T125 T164 T299 T454 T486 T552 T614 T621 T686 T758 T777 T1108 T1133 T1185 Y530 Y608 Y617 Y1031 | N36 N308 N857 | E1-E2 ATPase domain: K161-S204 | HMMER_PFAM
 | | | | | Transmembrane Domains: R103-S123, T130-I150, E320-W348, N368-K396, C891-F911, C921-E941, V969-G995, G1026-Y1054, V1079-T1104; N-terminus is non-cytosolic | TMAP
 | | | | | E1-E2 ATPases phosphorylation site signature BL00154: G183-L200, V432-F450, D690-L730, T825-S848 | BLIMPS_BLOCKS
 | | | | | E1-E2 ATPases phosphorylation site: A418-P466 | PROFILESCAN
 | | | | | P-type cation-transporting atpase superfamily signature PR00119: E213-Q227, F436-F450, A706-D716, I828-I847 | BLIMPS_PRINTS
 | | | | | ATPASE HYDROLASE TRANSMEMBRANE PHOSPHORYLATION ATPBINDING PROTEIN PROBABLE CALCIUMTRANSPORTING CALCIUM TRANSPORT PD004657: S862-R1103 PD004932: R34-P133 | BLAST_PRODOM
 | | | | | CHROMAFFIN GRANULE ATPASE II HYDROLASE TRANSMEMBRANE PHOSPHORYLATION ATPBINDING HOMOLOG PD038238: T1104-W1193 PD030421: K732-I801 | BLAST_PRODOM
 | | | | | do ATPASE; CALCIUM; TRANSPORTING; DM02405|P39524|236-1049: L116-N926 | BLAST_DOMO
 | | | | | ATP/GTP-binding site motif A (P-loop): A770-T777, G1124-S1131 | MOTIFS
 | | | | | E1-E2 ATPases phosphorylation site: D438-T444 | MOTIFS
3 | 7478871CD1 | 989 | S23 S51 S65 S149 S261 S304 S309 S369 S795 S800 S936 S953 S966 S968 T158 T206 T336 T368 T388 T629 T656 T691 T864 | N183 N555 N582 N606 N985 | HCO3-transporter family domain: L222-I897, K108-V157 | HMMER_PFAM
 | | | | | Transmembrane Domains: P227-L247, G260-M280, D412-L440, Q448-I474, P501-F529, R531-L554, H628-T656, R665-K693, A724-A744, K756-A776, F825-M853, T895-G923; N-terminus is non-cytosolic | TMAP
 | | | | | Anion exchangers family signature BL00219: G89-H120, Q224-L267, S269-R307, A308-K343, S382-A421, V422-D445, L475-F513, L515-I562, P631-D684, W721-L762, D763-R801, G806-L851, Y852-T895, I897-S936 | BLIMPS_BLOCKS
 | | | | | Anion exchangers family signatures: D372-Y424, A519-G571 | PROFILESCAN
 | | | | | Anion exchanger signature PR00165: F392-L414, Q417-G437, V450-G469, T473-S492, L504-S523, G536-L554, D632-L651, W719-M738 | BLIMPS_PRINTS
 | | | | | PROTEIN ANION EXCHANGE TRANSMEMBRANE BAND GLYCOPROTEIN LIPOPROTEIN PALMITATE BICARBONATE COTRANSPORTER PD001455: Q224-L846, S567-I897, L109-R189 | BLAST_PRODOM
 | | | | | BICARBONATE COTRANSPORTER SODIUM ELECTROGENIC NA+ PANCREAS COTRANSPORTER2 HCO3 TRANSPORTER F52B5.1 PD018437: Q898-N989 | BLAST_PRODOM
 | | | | | BAND 3 ANION TRANSPORT PROTEIN DM02294|P04920|602-1237: G620-E956, L318-P591, G187-G229 | BLAST_DOMO
4 | 7483601CD1 | 505 | S41 S238 S465 T13 T53 T128 T234 T464 T503 | N163 N166 | Sulfate transporter family domain: L193-T503 | HMMER_PFAM
 | | | | | Transmembrane Domains: L93-I121, T128-I156, A179-G199, G212-V232, N258-F278, L286-G306, F336-K364, A417-I445, E468-A495; N-terminus is non-cytosolic | TMAP
 | | | | | Sulfate transporters protein signature BL01130: S86-V139, S181-V232 | BLIMPS_BLOCKS
 | | | | | SULFATE TRANSPORTER TRANSPORT PROTEIN TRANSMEMBRANE GLYCOPROTEIN AFFINITY SULPHATE HIGH PERMEASE PD001121: I60-D155 | BLAST_PRODOM
 | | | | | PROTEIN TRANSPORT SULFATE TRANSPORTER TRANSMEMBRANE PERMEASE INTERGENIC REGION AFFINITY GLYCOPROTEIN PD001255: L257-R502 | BLAST_PRODOM
 | | | | | SULFATE TRANSPORTERS DM01229|P40879|5-462: R15-R463 | BLAST_DOMO
5 | 7487851CD1 | 618 | S127 S169 S259 S417 S458 S491 S590 S609 S616 T321 T522 T537 | N167 | Xanthine/uracil permeases family domain: G46-E481 | HMMER_PFAM
 | | | | | Transmembrane Domains: P44-C72, P198-L214, C224-G246, L267-P295, L319-Y343, L364-T383, S400-R419, L424-Y452, A454-A482, D494-E516; N-terminus is non-cytosolic | TMAP
 | | | | | Xanthine/uracil permease signature BL01116: R362-G413, G415-F451 | BLIMPS_BLOCKS
 | | | | | YOLK SAC PERMEASELIKE YSPL1 FORM 1 YOLK SAC PERMEASELIKE YSPL1 FORM 4 YOLK SAC PERMEASELIKE YSPL1 FORM 3 YOLK SAC PERMEASELIKE YSPL1 FORM 2 PD019501: G437-Q617 PD137940: Q29-P83 | BLAST_PRODOM
 | | | | | XANTHINE/URACIL PERMEASES FAMILY DM01485|S33349|7-188: G363-L473 | BLAST_DOMO
6 | 7472881CD1 | 377 | S15 S16 S91 S324 S337 T310 T332 T336 T374 | N4 N14 N157 | Sodium Bile acid symporter family domain: T39-W220 | HMMER_PFAM
 | | | | | Signal Peptide: M41-A97 | SPSCAN
 | | | | | Transmembrane domains: G28-R56, A69-S89, V95-F115, T131-S153, T159-V182, K191-G218, W220-T248, L283-A30 | TMAP
 | | | | | PROTEIN TRANSMEMBRANE ACID COTRANSPORTING POLYPEPTIDE TRANSPORT SYMPORT SODIUM/BILE COTRANSPORTER NA+/BILE PD002890: M41-D223 | BLAST_PRODOM
 | | | | | ACID COTRANSPORTING POLYPEPTIDE SODIUM/BILE COTRANSPORTER NA+/BILE SODIUM/TAUROCHOLATE TRANSMEMBRANE TRANSPORT SYMPORT PD007533: W220-R313 | BLAST_PRODOM
 | | | | | do SODIUM; ACID; BILE; TRANSPORTER; DM03972|I38655|8-318: L30-K321 DM03972|P09131|163-477: P12-S277 DM03972|P26435|1-314: A10-R313 | BLAST_DOMO
7 | 7612560CD1 | 507 | S22 S26 S41 S261 S341 S374 S384 T36 | N181 N190 N477 N232 | Transmembrane amino acid transporter protein domain: A78-G458 | HMMER_PFAM
 | | | | | Transmembrane domains: G74-M102, A143-F168, F208-L236, P266-E286, P296-L316, V342-I370, L381-P401, I407-E427, S437-A462; N-terminus is cytosolic | TMAP
 | | | | | ACID AMINO PROTEIN TRANSPORTER PERMEASE TRANSMEMBRANE INTERGENIC REGION PUTATIVE PROLINE PD001875: K49-L356 | BLAST_PRODOM
8 | 2880370CD1 | 438 | S48 S80 S300 S407 T15 T38 T92 | N56 N85 N99 | Signal Peptide: M1-R20, M1-M21, M1-S23 | HMMER
 | | | | | Signal Cleavage: M1-A19 | SPSCAN
 | | | | | Sodium Bile acid symporter family: L148-D332 | HMMER_PFAM
 | | | | | Transmembrane domains: K4-R20, A135-F158, I178-A206, G218-M238, L244-S264, P270-V290, I305-G325, E335-A355, V368-P389, P400-R423; N-terminus is cytosolic | TMAP
 | | | | | PROTEIN TRANSMEMBRANE ACID COTRANSPORTING POLYPEPTIDE TRANSPORT SYMPORT SODIUM/BILE COTRANSPORTER NA+/BILE PD002890: L150-D332 | BLAST_PRODOM
 | | | | | P3 PROTEIN TRANSMEMBRANE TRANSPORT SYMPORT PD103884: G317-L416 | BLAST_PRODOM
 | | | | | do SODIUM; ACID; BILE; TRANSPORTER; DM03972|P09131|163-477: V121-L416 DM03972|I38655|8-318: I143-C424 DM03972|P26435|1-314: I143-R423 | BLAST_DOMO
9 | 6267489CD1 | 350 | S68 S121 S188 S233 S336 T29 T41 T136 T146 T288 Y84 Y194 Y241 Y294 | N60 N87 | ATP synthase (C/AC39) subunit: Y15-P348 | HMMER_PFAM
 | | | | | Transmembrane domain: R86-N114; N-terminus is non-cytosolic | TMAP
 | | | | | SUBUNIT VATPASE AC39 VACUOLAR ATP SYNTHASE HYDROLASE HYDROGEN ION TRANSPORT PD008622: L78-G285 | BLAST_PRODOM
 | | | | | SUBUNIT VATPASE AC39 VACUOLAR ATP SYNTHASE HYDROLASE HYDROGEN ION TRANSPORT PD013947: L2-R77 | BLAST_PRODOM
 | | | | | do AC39; ATP; VACUOLAR; SYNTHASE DM03240|P54641|10-355: G4-I349 DM03240|P12953|1-272: E81-I349 DM03240|P53659|1-363: L2-G286; G201-I349 DM03240|P32366|32-344: N35-I349 | BLAST_DOMO
10 | 7484777CD1 | 1707 | S54 S63 S80 S116 S122 S134 S150 S365 S388 S453 S519 S554 S681 S711 S771 S840 S841 S900 S1037 S1170 S1212 S1213 S1222 S1229 S1241 S1278 S1393 S1397 S1398 S1405 S1501 S1546 S1595 S1612 S1619 S1639 S1655 S1657 S1668 S1678 S1679 S1689 S1694 T42 T162 T300 T575 T612 T613 T1070 T1115 T1137 T1184 T1265 T1271 T1285 T1308 T1451 T1465 T1608 T1650 Y70 Y798 Y1010 | N40 N111 N297 N386 N451 N573 N729 N732 N942 N1068 N1113 N1211 N1227 N1626 | Transient receptor: Y1096-M1154, R970-E1035, P899-L960, D715-W761 | HMMER_PFAM
 | | | | | Transmembrane domain: W5-E27, G204-I228, D550-R578, F865-V893, L937-R959, V975-G995, M1005-A1025, W1087-T1115; N-terminus is non-cytosolic | TMAP
 | | | | | Transient receptor potential family signature PR01097: A1094-T1115, F1116-F1129, V1143-M1156 | BLIMPS_PRINTS
 | | | | | PROTEIN MELASTATIN CHROMOSOME TRANSMEMBRANE C05C12.3 T01H8.5 I F54D1.5 IV PD018035: M154-L486 | BLAST_PRODOM
 | | | | | PROTEIN CHROMOSOME TRANSMEMBRANE MELASTATIN C05C12.3 T01H8.5 I F54D1.5 IV PD151509: I982-L1270 | BLAST_PRODOM
 | | | | | PROTEIN CHROMOSOME TRANSMEMBRANE MELASTATIN C05C12.3 T01H8.5 I F54D1.5 IV PD039592: E617-E813 | BLAST_PRODOM
 | | | | | PROTEIN MELASTATIN CHROMOSOME TRANSMEMBRANE T01H8.5 I C05C12.3 F54D1.5 IV PD022180: W481-R591 | BLAST_PRODOM
 | | | | | ANK MOTIF REPEAT DM03196|P34586|38-822: I972-C1162 DM03196|P19334|1-772: D962-I1157 DM03196|P48994|13-780: I978-Q1159 | BLAST_DOMO
11 | 2493969CD1 | 771 | S34 S156 S186 S379 S403 S435 S468 S488 S499 S677 S682 S703 S716 S744 T6 T54 T126 T273 T274 T449 T518 T543 T712 | N163 N282 N676 | Transmembrane domains: L49-G76, L77-Y105, V125-A153, S186-I211, G212-Y240, S252-T274, P286-Y314, G330-L350, F355-A375, I389-L417, T561-Y589, S594-P622, A629-K649, W655-W675; N-terminus is cytosolic | TMAP
 | | | | | Amino acid permeases protein signature BL00218: V56-G84, V87-S118, Y263-L307, A344-T383 | BLIMPS_BLOCKS
 | | | | | AMINO ACID CATIONIC TRANSPORTER TRANSPORT TRANSMEMBRANE GLYCOPROTEIN TRANSPORTER1 PROTEIN HIGH AFFINITY PD000262: V614-L688 | BLAST_PRODOM
 | | | | | TRANSMEMBRANE TRANSPORT PROTEIN TRANSPORTER AMINO ACID PERMEASE AMINO ACID GLYCOPROTEIN MEMBRANE PD000214: L49-L421 | BLAST_PRODOM
 | | | | | do ANTIPORTER; ORNITHINE; PUTRESCINE; TRANSPORT; DM01125|P30825|23-373: T47-W241 | BLAST_DOMO
12 | 3244593CD1 | 1329 | S10 S20 S28 S81 S156 S208 S216 S230 S397 S407 S448 S473 S491 S517 S619 S631 S667 S725 S853 S868 S979 S1024 S1086 S1128 S1159 S1190 S1228 S1259 T152 T295 T301 T324 T373 T425 T452 T483 T575 T649 T684 T752 T805 T857 T875 T1046 T1055 T1091 T1180 T1268 Y714 | N405 N438 N540 N602 N803 N951 N1226 | ABC transporter transmembrane region: V123-I391, L766-V1044 | HMMER_PFAM
 | | | | | ABC transporter domain: G1117-G1300, G506-G677 | HMMER_PFAM
 | | | | | Transmembrane domains: F118-H146, V159-F179, A185-N205, E233-A253, G260-M280, A350-R370, S379-K399, T759-L786, H819-T846, F904-F932, N989-S1017; N-terminus is non-cytosolic | TMAP
 | | | | | ABC transporters family signature: L585-D634, T1208-D1258 | PROFILESCAN
 | | | | | ATP-BINDING TRANSPORT TR PD00131: G876-D885, S1128-V1181, G1275-A1312 | BLIMPS_PRODOM
 | | | | | ATP-BINDING TRANSPORT TRANSMEMBRANE PROTEIN GLYCOPROTEIN MULTIDRUGSULFONYLUREA RECEPTOR RESISTANCE ASSOCIATED CONDUCTANCE PD003781: L543-L601 | BLAST_PRODOM
 | | | | | ABC TRANSPORTERS FAMILY DM00008|P33527|1293-1502: I1090-G1300, D490-G677 | BLAST_DOMO
 | | | | | ABC transporters family signature: L603-V617, F1227-L1241 | MOTIFS
 | | | | | ATP/GTP-binding site motif A (P-loop): G513-S520, G1124-S1131 | MOTIFS
13 | 4921451CD1 | 1353 | S11 S53 S146 S183 S199 S347 S422 S500 S513 S532 S592 S638 S644 S841 S865 S876 S900 S1090 S1232 S1236 S1244 S1248 S1287 S1295 S1302 S1321 T8 T79 T113 T234 T306 T312 T391 T618 T639 T690 T744 T757 T807 T924 T1030 T1272 T1284 Y367 Y431 Y706 | N637 | Transmembrane domains: F130-L158, D394-S422, V448-L473, R996-A1024, F1055-R1083, D1093-V1113, I1117-I1137, S1163-I1191; N-terminus is non-cytosolic | TMAP
 | | | | | E1-E2 ATPases phosphoryl BL00154: V508-F526, D748-L788, T943-A966 | BLIMPS_BLOCKS
 | | | | | E1-E2 ATPases phosphorylation site: A494-P539 | PROFILESCAN
 | | | | | P-type cation-transporting atpase superfamily signature PR00119: F512-F526, S764-D774, I946-L965 | BLIMPS_PRINTS
 | | | | | ATPASE HYDROLASE TRANSMEMBRANE PHOSPHORYLATION ATP-BINDING PROTEIN PROBABLE CALCIUM TRANSPORTING CALCIUM TRANSPORT PD004657: L981-V1034, G1028-I1180 PD006317: A270-D343, F200-P223 PD149930: C920-F979 | BLAST_PRODOM
 | | | | | PROBABLE CALCIUM TRANSPORTING ATPASE 8 EC 3.6.1.38 HYPOTHETICAL PROTEIN HYDROLASE CALCIUM TRANSPORT TRANSMEMBRANE PHOSPHORYLATION MAGNESIUM ATP-BINDING PD101227: G582-I768 | BLAST_PRODOM
 | | | | | do ATPASE; CALCIUM; TRANSPORTING; DM02405|P32660|318-1225: A270-E549, P580-L796, R906-G1031, F200-P223 | BLAST_DOMO
 | | | | | E1-E2 ATPases phosphorylation site: D514-T520 | MOTIFS
 | | | | | EF-hand calcium-binding domain: D1033-L1045 | MOTIFS
14 | 5547443CD1 | 921 | S5 S46 S74 S215 S225 S277 S304 S475 S495 S502 S515 S538 S598 S656 S688 S747 S808 S829 S855 S881 T42 T57 T67 T127 T163 T329 T337 T364 T609 T614 T686 T710 T722 T781 T839 Y529 Y880 | N223 N612 | K+ channel tetramerisation domain: D8-H105, Q391-S488 | HMMER_PFAM
 | | | | | do CHANNEL; POTASSIUM; CDRK; SHAW; DM00490|P17971|32-138: N13-P92 (P-value = 8.5e−05) | BLAST_DOMO
15 | 56008413CD1 | 530 | S6 S151 S268 S306 S476 T56 T57 T90 T199 T262 T338 | N396 N523 | Nucleoside transporter domain: L170-S507 | HMMER_PFAM
 | | | | | TRANSMEMBRANE DOMAINS: R66-Y94, G101-R129, T134-R162, T231-R256, V348-E375, H380-L408, H416-Y436, A447-P467; N-terminus is non-cytosolic | TMAP
 | | | | | PROTEIN NUCLEOSIDE TRANSPORTER TRANSMEMBRANE NUCLEOLAR HNP36 DELAYED EARLY RESPONSE DER12 NUCLEAR PD005103: V182-Y503 | BLAST_PRODOM
16 | 6127911CD1 | 1617 | S30 S50 S134 S249 S353 S491 S672 S761 S792 S809 S819 S915 S923 S954 S1035 S1127 S1193 S1269 S1295 S1329 S1488 T111 T206 T558 T572 T624 T643 T755 T772 T780 T852 T968 T1172 T1257 T1340 T1370 T1418 T1441 T1462 T1545 T1605 Y947 | N71 N84 N91 N109 N130 N241 N436 N544 N576 N911 N940 N990 N1305 | Signal Peptide: M26-L46 | HMMER
 | | | | | ABC transporter domains: G507-G689, G1313-G1489 | HMMER_PFAM
 | | | | | TRANSMEMBRANE DOMAINS: R25-N53, E221-K247, A262-I282, I292-V312, L322-L342, E356-N382, D392-I420, L848-Y876, H1006-G1034, Q1061-Y1081, V1095-M1115, F1132-V1160, C1200-M1226; N-terminus is non-cytosolic | TMAP
 | | | | | ABC transporters family signature: V595-D646 | PROFILESCAN
 | | | | | ABC TRANSPORTERS FAMILY DM00008|P41233|839-1045: I478-N688, K1300-M1486 | BLAST_DOMO
 | | | | | ATP/GTP-binding site motif A (P-loop): G514-S521, G1320-S1327 | MOTIFS
17 | 6427133CD1 | 1192 | S4 S152 S216 S259 S268 S296 S366 S391 S408 S437 S440 S456 S483 S493 S545 S744 S833 S1114 S1115 S1124 S1125 S1144 S1157 S1168 T35 T267 T378 T403 T519 T540 T646 T900 T1063 T1095 T1120 T1178 T1189 Y22 Y28 Y607 | N238 N538 N726 N1165 | TRANSMEMBRANE DOMAINS: A58-L86, D270-W298, F327-H353, G862-F890, T900-G923, F950-Y978, A995-S1015, H1022-N1042, S1061-K1089 | TMAP
 | | | | | E1-E2 ATPases phosphorylation site signature BL00154: G133-L150, I386-F404, D650-M690, T810-S833 | BLIMPS_BLOCKS
 | | | | | E1-E2 ATPases phosphorylation site: A372-L421 | PROFILESCAN
 | | | | | P-type cation-transporting atpase superfamily signature PR00119: F390-F404, A666-D676, I813-I832 | BLIMPS_PRINTS
 | | | | | ATPASE HYDROLASE TRANSMEMBRANE PHOSPHORYLATION ATPBINDING PROTEIN PROBABLE CALCIUMTRANSPORTING CALCIUM TRANSPORT PD004657: S847-P1094 PD006317: Q123-H222 PD149930: C787-Y846 | BLAST_PRODOM
 | | | | | FIC1 PROTEIN PD180313: H1040-P1154 | BLAST_PRODOM
 | | | | | do ATPASE; CALCIUM; TRANSPORTING; DM02405|P39524|236-1049: L66-N696, A755-N911 | BLAST_DOMO
 | | | | | E1-E2 ATPases phosphorylation site: D392-T398 | MOTIFS
18 | 7472932CD1 | 625 | S86 S280 S339 S510 S554 T205 T387 T505 T516 T589 T594 T612 | N144 N168 N174 N351 | Sodium: neurotransmitter symporter family domain: R18-L588 | HMMER_PFAM
 | | | | | TRANSMEMBRANE DOMAINS: E17-R43, C48-L76, Y96-W124, S178-V198, T204-L224, P251-N279, V295-N323, P394-T414, E420-A440, C446-E466, A472-Y492, W513-R541, P561-T589; N-terminus is non-cytosolic | TMAP
 | | | | | Sodium: neurotransmitter symporter family signature BL00610: Q26-E75, W90-C139, W181-G232, I247-T299, T389-V431, V485-P539, K558-P580 | BLIMPS_BLOCKS
 | | | | | Sodium: neurotransmitter symporter family signatures: D22-L76 | PROFILESCAN
 | | | | | Sodium/neurotransmitter symporter signature PR00176: Q26-L47, A55-V74, G99-Y125, V208-I225, V290-V310, M393-L412, S474-M494, R514-L534 | BLIMPS_PRINTS
 | | | | | TRANSPORTER NEUROTRANSMITTER TRANSPORT TRANSMEMBRANE SYMPORT GLYCOPROTEIN SODIUM CHLORIDE-DEPENDENT SODIUM-DEPENDENT GABA PD000448: L363-R598, R18-D284 | BLAST_PRODOM
 | | | | | ORPHAN TRANSPORTER ISOFORM A12 A11 B11A8 B9 A10 RENAL PD037829: K314-L368 PD150276: S137-Q180 | BLAST_PRODOM
 | | | | | TRANSMEMBRANE TRANSPORT PROTEIN TRANSPORTER AMINOACID PERMEASE AMINO ACID GLYCOPROTEIN MEMBRANE PD000214: L28-L311, S375-L534 | BLAST_PRODOM
 | | | | | SODIUM: NEUROTRANSMITTER SYMPORTER FAMILY DM00572|S50998|19-616: A11-R591 | BLAST_DOMO
19 | 8463147CD1 | 1181 | S16 S43 S52 S68 S97 S106 S153 S164 S196 S293 S347 S393 S424 S425 S531 S651 S674 S697 S709 S854 S907 S937 S973 S996 S1009 S1022 S1060 S1068 S1075 S1093 S1162 S1166 S1175 T81 T337 T377 T432 T503 T602 T701 T702 T977 T1013 T1046 T1137 T1171 | N104 N137 N329 N591 N600 N619 N1044 N1169 | TRANSMEMBRANE DOMAINS: L98-L120, W135-L163, I173-F201, K233-L259, L266-D286, V298-Y318, M808-S826, V867-T883, S918-Y944; N-terminus is cytosolic | TMAP
 | | | | | CHANNEL POTASSIUM IONIC CALCIUM-ACTIVATED ALPHA CALCIUM SUBUNIT ACTIVATED PROTEIN LARGE PD003090: R323-F609, S877-P966, S656-G716, V771-V867, Q1123-I1148 | BLAST_PRODOM
 | | | | | do CHANNEL; POTASSIUM; MSLO; ACTIVATED; DM05442|A48206|351-1123: R323-F609, P927-P966, G777-V867, Q1123-I1148, G1110-E1160 | BLAST_DOMO
 | | | | | ATP/GTP-binding site motif A (P-loop): G1071-T1078 | MOTIFS
20 | 7506408CD1 | 233 | S71 S116 S219 T29 T171 Y77 Y124 Y177 | | ATP synthase (C/AC39) subunit: Y15-P231 | HMMER_PFAM
 | | | | | SUBUNIT VATPASE AC39 VACUOLAR ATP SYNTHASE HYDROLASE HYDROGEN ION TRANSPORT PD008622: G84-I232, G14-G168 | BLAST_PRODOM
 | | | | | ATP; VACUOLAR; SYNTHASE DM03240|P12953|1-272: F46-I232 DM03240|P54641|10-355: D32-I232, G4-E43 DM03240|P53659|1-363: G14-I232 DM03240|P32366|32-344: V37-I232 | BLAST_DOMO
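The domain and motif locations in Table 3 are written as residue spans such as S43-L564, that is, a one-letter amino acid code with its position for the first residue and for the last residue. As an illustrative aid only, the following minimal Python sketch shows one way such a span could be parsed and its length computed. The function names `parse_span` and `span_length` are hypothetical and are not part of the disclosure; the sketch assumes spans are inclusive at both ends.

```python
import re

# Hypothetical helper pattern: one-letter residue code + position,
# a hyphen, then one-letter residue code + position (e.g. "S43-L564").
SPAN = re.compile(r"^([A-Z])(\d+)-([A-Z])(\d+)$")


def parse_span(span):
    """Split a Table 3 style span into (first_residue, start, last_residue, end)."""
    match = SPAN.match(span.strip())
    if match is None:
        raise ValueError(f"not a residue span: {span!r}")
    first_residue, start, last_residue, end = match.groups()
    return first_residue, int(start), last_residue, int(end)


def span_length(span):
    """Number of residues covered, assuming both end points are included."""
    _, start, _, end = parse_span(span)
    return end - start + 1


# Example: the HMMER_PFAM sugar transporter domain listed for SEQ ID NO: 1.
pfam_domain = "S43-L564"
print(parse_span(pfam_domain))
print(span_length(pfam_domain))
```

A span whose end position exceeds the polypeptide length given in the Amino Acid Residues column (617 for SEQ ID NO: 1) would indicate a transcription error rather than a real feature, so the same helper can serve as a simple consistency check.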

[0405] TABLE 4 Polynucleotide SEQ ID NO:/ Incyte ID/Sequence LengthSequence Fragments 21/6911460CB1/ 1-512, 1-756, 5-607, 120-658, 144-421,144-540, 144-598, 144-624, 144-646, 144-681, 144-694, 144-697, 144-727,2232 144-745, 144-769, 144-810, 147-764, 215-899, 320-522, 321-601,321-681, 321-799, 321-817, 321-875, 321-884, 321-888, 321-899, 321-923,322-969, 337-1058, 371-1112, 382-1044, 454-1062, 495-1011, 513-1350,568-1324, 578-1130, 597-1205, 599-1249, 605-1124, 619-834, 620-1356,641-1524, 674-1455, 678-1410, 689-1516, 701-1364, 724-1260, 731-1350,731-1429, 731-1571, 732-1306, 738-1372, 748-1439, 750-1430, 761-1528,765-1440, 772-1332, 785-1275, 806-1463, 819-1363, 843-1439, 848-1458,875-1417, 916-1528, 918-1408, 928-1366, 928-1456, 931-1387, 945-1443,948-1470, 953-1466, 955-1357, 956-1597, 957-1656, 962-1721, 967-1616,969-1596, 1018-1630, 1031-1696, 1032-1730, 1034-1769, 1038-1724,1060-1792, 1142-1787, 1163-1817, 1178-1848, 1179-1845, 1180-1765,1182-1806, 1258-1586, 1258-1672, 1258-1679, 1258-1836, 1258-1840,1258-1848, 1258-1851, 1258-1867, 1258-1881, 1258-1909, 1258-1938,1258-1961, 1258-1967, 1320-1561, 1391-1679, 1504-1746, 1504-1788,1504-1850, 1504-1852, 1504-1863, 1504-1872, 1504-1887, 1504-1888,1504-1892, 1504-1893, 1504-1900, 1504-1905, 1504-1912, 1504-1913,1504-1920, 1504-1935, 1504-1947, 1504-1953, 1504-1960, 1504-1990,1504-2014, 1504-2021, 1504-2031, 1504-2051, 1504-2070, 1504-2089,1504-2098, 1504-2127, 1504-2202, 1504-2217, 1506-2175, 1573-2219,1576-1698, 1579-2222, 1579-2232, 1581-2231, 1583-2212, 1591-2132,1623-1914, 1649-1744, 1652-1955 22/55138203CB1/ 1-735, 3-735, 5-729,5-735, 21-735, 37-735, 87-516, 310-610, 310-758, 310-831, 310-849,518-1026, 529-735, 533-1026, 4135 580-735, 685-735, 687-735, 745-1412,754-1188, 1159-1631, 1159-1640, 1561-1938, 1700-1868, 1875-2532,2221-2367, 2251-2460, 2251-2540, 2368-2835, 2488-3083, 2512-3140,2544-3085, 2580-3188, 2724-3031, 2724-3131, 3112-3707, 3113-3165,3113-3429, 3484-4079, 3556-3737, 3571-3776, 3571-4135, 
3612-4095,3614-3733, 3648-4098, 3649-4079, 3650-3717 23/7478871CB1/ 1-302, 1-462,303-462, 406-576, 406-706, 648-2970, 649-809, 649-897, 810-897,810-1052, 898-1052, 898-1169, 2970 1053-1169, 1170-1344, 1170-1478,1255-2103, 1345-1478, 1345-1742, 1479-1742, 1479-1800, 1563-1800,1623-1800, 1764-1800, 1801-1989, 1801-2103, 1990-2103, 1990-2265,2104-2265, 2104-2444, 2107-2343, 2266-2444, 2266-2517, 2445-2517,2518-2586, 2518-2760, 2587-2760, 2587-2916, 2760-2970, 2761-2916,2917-2970 24/7483601CB1/ 1-152, 1-292, 1-890, 34-669, 34-673, 36-673,61-673, 153-292, 153-403, 293-403, 293-570, 404-570, 569-735, 569-890,1835 591-888, 736-890, 820-1233, 820-1543, 820-1578, 820-1617, 827-1233,1309-1643, 1309-1835 25/7487851CB1/ 1-232, 1-367, 1-427, 1-517, 1-578,1-625, 1-703, 1-854, 4-427, 4-558, 5-297, 5-397, 7-686, 79-631, 185-717,233-497, 2220 241-862, 265-808, 271-427, 271-571, 271-670, 271-680,271-772, 271-775, 271-792, 271-796, 271-812, 271-836, 271-886, 271-895,271-955, 273-961, 273-1104, 277-807, 292-947, 293-947, 323-641, 342-935,353-947, 367-427, 382-427, 386-903, 395-427, 397-427, 400-588, 451-657,451-708, 452-485, 452-489, 452-670, 452-886, 452-947, 486-947, 489-1046,493-662, 507-695, 540-1351, 564-1150, 577-1232, 577-1359, 579-1233,581-947, 586-947, 590-947, 612-1225, 621-708, 647-947, 701-1369,708-833, 709-947, 734-835, 734-898, 741-833, 746-947, 758-1355,758-1451, 764-947, 764-1424, 765-1374, 771-1285, 776-947, 777-1530,790-1014, 799-1312, 810-947, 817-1398, 828-1375, 831-948, 841-1505,845-1253, 859-1423, 861-1026, 876-1364, 877-1502, 880-1382, 888-1602,891-1715, 907-1418, 926-1516, 961-1113, 971-1373, 973-1277, 980-1418,991-1161, 1010-1514, 1016-1568, 1032-1563, 1033-1309, 1055-1863,1057-1686, 1059-1200, 1062-1584, 1069-1871, 1070-1241, 1078-1675,1081-1565, 1081-1578, 1104-1738, 1112-1595, 1133-1895, 1154-1682,1157-1762, 1172-1778, 1176-1370, 1184-1918, 1192-1463, 1197-1301,1197-1306, 1201-1306, 1202-1709, 1238-1629, 1250-2078, 1251-2011,1257-1306, 1259-1306, 1263-1828, 
1266-1306, 1266-1897, 1268-1752,1283-1306, 1289-1789, 1291-1784, 1301-1810, 1304-1660, 1306-1676,1319-1716, 1319-1721, 1351-1660, 1358-1382, 1358-1497, 1358-1527,1358-1558, 1358-1636, 1358-1660, 1358-1667, 1358-1693, 1358-1717,1358-1738, 1358-1760, 1358-1771, 1358-1843, 1358-1899, 1358-1952,1358-1971, 1358-1974, 1358-1995, 1358-1998, 1358-2044, 1361-1920,1363-2057, 1365-1882, 1365-2051, 1377-1963, 1380-1975, 1421-2090,1427-2012, 1454-2037, 1473-1613, 1482-1952, 1491-1630, 1501-2125,1519-2153, 1519-2178, 1530-2216, 1532-2194, 1548-1955, 1554-1818,1557-1795, 1557-2123, 1566-2219, 1567-2160, 1568-1863, 1570-2053,1570-2220, 1577-2078, 1581-2220, 1597-2220, 1601-2200, 1606-2192,1608-2220, 1623-2220, 1631-1946, 1647-2176, 1651-1931, 1651-2107,1651-2176, 1651-2219, 1651-2220, 1654-1945, 1671-2220, 1672-1841,1674-2220, 1690-2220, 1702-2106, 1728-1941, 1754-2220, 1794-2220,1796-2220, 1835-2220, 1845-2090, 1845-2220, 1867-2220, 1871-2220,1874-2220, 1878-2220, 1898-2220, 1900-2220, 1954-2220, 2019-2220,2037-2220, 2052-2220, 2078-2220, 2094-2220 26/7472881CB1/ 1-236, 47-219,134-622, 243-1070, 245-821, 245-841, 245-888, 245-901, 245-935, 245-951,245-988, 249-622, 250-1073, 1517 292-1071, 292-1073, 309-1073, 347-1073,385-621, 385-622, 386-621, 418-621, 425-1073, 625-1383, 847-1515,847-1517, 877-1517, 893-1073 27/7612560CB1/ 1-258, 1-450, 1-502, 1-548,1-575, 1-595, 1-599, 1-670, 1-679, 1-749, 13-814, 53-278, 53-493,53-495, 53-502, 53-517, 2142 53-546, 53-552, 53-563, 53-601, 53-604,53-609, 53-614, 53-618, 56-597, 56-611, 56-613, 56-614, 195-983,292-979, 301-832, 355-803, 452-1077, 501-1142, 533-845, 536-985,552-1013, 615-1269, 615-1613, 624-880, 631-1030, 641-1231, 686-1269,792-1269, 811-1269, 820-1269, 852-1266, 865-1269, 909-1269, 925-1269,926-1269, 933-1321, 933-1326, 1026-1269, 1070-1269, 1081-1269,1199-1537, 1538-1832, 1553-1806, 1553-2106, 1553-2131, 1553-214228/2880370CB1/ 1-526, 1-528, 1-529, 345-1661, 465-673, 465-838, 465-854,569-833, 1031-1307, 1032-1308, 1032-1590 
1661 29/6267489CB1/ 1-280,82-362, 100-369, 100-379, 103-742, 103-795, 112-571, 113-399, 124-653,145-267, 182-441, 182-904, 514-588, 1501 593-1215, 638-1313, 640-1140,640-1149, 640-1152, 686-1314, 735-1225, 739-1287, 739-1289, 772-1358,804-1501, 836-1498, 841-1294, 933-1501 30/7484777CB1/ 1-658, 100-713,100-738, 100-904, 100-931, 250-944, 414-556, 414-593, 414-600, 414-611,414-616, 414-625, 414-703, 5526 414-707, 414-724, 414-750, 414-856,414-884, 414-886, 414-887, 414-903, 414-904, 414-911, 414-912, 414-919,414-928, 414-929, 414-935, 414-939, 414-953, 414-961, 414-972, 414-974,414-1008, 414-1022, 414-1032, 414-1043, 414-1048, 414-1064, 414-1065,414-1077, 414-1084, 414-1108, 414-1118, 414-1154, 414-1179, 414-1180,416-1014, 419-988, 431-563, 431-565, 432-1145, 454-886, 459-1154,469-1095, 486-1018, 486-1033, 492-698, 502-1072, 522-966, 572-1219,622-1337, 644-1123, 644-1211, 659-1329, 666-1155, 676-1331, 676-1332,686-1418, 691-1332, 694-1332, 694-1333, 694-1359, 701-1155, 704-1333,705-1333, 714-1333, 719-1398, 723-1333, 727-1478, 729-1333, 730-1285,731-1155, 736-1426, 746-1333, 746-1419, 752-1418, 773-1419, 775-1333,779-1333, 780-1419, 782-1333, 787-1333, 797-1333, 798-1333, 800-1480,819-1384, 822-1515, 839-1332, 844-1333, 845-1333, 849-1039, 850-1613,887-1333, 890-1333, 892-1333, 906-1333, 908-1138, 910-1384, 912-1333,919-1333, 920-1154, 926-1384, 953-1516, 975-1592, 983-1384, 997-1399,997-1419, 1007-1683, 1038-1525, 1038-1685, 1038-1696, 1038-1699,1038-1773, 1040-1699, 1186-1917, 1220-1898, 1222-1907, 1291-1979,1374-1991, 1635-2139, 1635-2151, 1635-2198, 1635-2454, 1639-2311,2018-2692, 2018-2725, 2081-2852, 2138-2817, 2169-2725, 2263-2929,2283-2940, 2293-2991, 2302-3152, 2312-2786, 2338-2973, 2340-2887,2351-2896, 2352-3152, 2365-3152, 2365-3170, 2382-2886, 2457-2996,2568-3415, 2700-3483, 2705-3313, 2722-3373, 2746-3423, 2765-3236,2770-3423, 2822-3530, 2823-3645, 2845-3703, 2854-3533, 2860-3423,2868-3423, 2876-3423, 2880-3423, 2917-3347, 2946-3423, 
2975-3218,2975-3261, 2975-3359, 2975-3361, 2976-3352, 2976-3389, 2976-3393,2976-3419, 2976-3506, 2976-3550, 2986-3478, 3010-3247, 3046-3414,3142-3378, 3142-3600, 3143-3668, 3147-3859, 3236-3721, 3327-4170,3696-4179, 3773-4176, 3773-4196, 3773-4242, 3773-4253, 3773-4271,3773-4274, 3773-4276, 3773-4284, 3773-4290, 3773-4302, 3773-4303,3773-4329, 3773-4337, 3773-4339, 3773-4340, 3773-4350, 3773-4354,3773-4377, 3773-4427, 3773-4467, 3773-4480, 3773-4487, 3773-4505,3773-4521, 3773-4567, 3773-4572, 3775-4299, 3775-4308, 3775-4478,3786-4670, 3804-4253, 3819-4444, 3943-4822, 3964-4798, 3988-4445,3989-4599, 4001-4179, 4204-4615, 4242-4500, 4251-5000, 4266-5000,4309-5000, 4310-4696, 4329-5000, 4474-5000, 4797-5000, 4813-5000,4870-5142, 4870-5335, 4870-5381, 4870-5388, 4870-5406, 4870-5432,4870-5440, 4870-5441, 4870-5449, 4870-5462, 4870-5468, 4870-5515,4870-5516, 4870-5526, 4872-5469, 4873-5245, 4946-5467, 4956-540331/2493969CB1/ 1-701, 1-705, 43-383, 197-760, 293-2536, 980-1126,1174-1443, 1218-1860, 1256-1863, 1339-1863, 1563-1880, 2739 2001-2716,2097-2715, 2156-2703, 2213-2311, 2223-2739, 2299-2739 32/3244593CB1/1-1712, 32-979, 32-1712, 980-2810, 1089-1645, 1089-1661, 1089-1676,1089-1700, 1089-1710, 1089-1711, 1711-3990, 4321 1988-2016, 2282-2628,2282-2631, 2282-2845, 2282-3545, 2285-2629, 2300-2845, 3103-3344,3103-3395, 3103-3528, 3103-3540, 3103-3545, 3103-3573, 3103-3603,3103-3613, 3103-3616, 3103-3620, 3103-3629, 3103-3660, 3103-3687,3103-3708, 3103-3730, 3103-3754, 3103-3772, 3103-3778, 3103-3805,3103-3809, 3103-3836, 3103-3856, 3103-3881, 3106-3789, 3115-3369,3115-3586, 3119-3670, 3132-3545, 3132-3573, 3143-3417, 3143-3545,3177-3545, 3235-3545, 3262-4127, 3275-3545, 3315-3545, 3318-3545,3351-3545, 3355-3944, 3360-3545, 3366-3545, 3380-3545, 3384-3545,3390-3771, 3397-3926, 3415-3545, 3438-3545, 3439-3484, 3444-3545,3445-3545, 3477-3545, 3546-3804, 3546-3839, 3546-3843, 3546-3859,3546-3866, 3546-3868, 3546-3874, 3546-3878, 3546-3884, 3546-3893,3546-3904, 3546-3907, 
3546-3927, 3546-3934, 3546-3937, 3546-3953,3546-3954, 3546-3961, 3546-3966, 3546-3977, 3546-3989, 3546-3994,3546-4050, 3546-4057, 3546-4065, 3546-4075, 3546-4103, 3546-4136,3546-4142, 3546-4214, 3552-4084, 3554-4157, 3554-4181, 3554-4229,3555-4218, 3559-4075, 3564-4063, 3573-4079, 3606-4206, 3614-4188,3633-4321, 3641-4321, 3661-4287, 3664-4321, 3666-4292, 3668-4279,3669-4022, 3681-4306, 3683-4223, 3688-4291, 3708-4227, 3713-4243,3716-4314, 3740-4185, 3753-4319, 3769-4287, 3781-4127, 3796-432133/4921451CB1/ 1-246, 1-373, 158-323, 158-373, 258-672, 383-409,383-466, 559-677, 559-751, 559-753, 559-986, 559-1073, 894-1524, 4519898-1382, 973-1484, 973-1555, 1046-4299, 1116-1556, 1181-1839,1255-1529, 1308-1839, 1309-1838, 1343-1821, 1434-1814, 1440-1814,1464-1834, 1488-1839, 1571-1839, 1709-1766, 1847-1987, 3331-3745,3950-4069, 3950-4119, 3950-4160, 3950-4216, 3956-4297, 4067-4516,4076-4519, 4166-4519, 4204-4492, 4242-4518, 4253-4516 34/5547443CB1/1-297, 13-297, 96-364, 96-697, 126-297, 298-2778, 649-889, 749-986,1071-1185, 1071-1744, 1491-1751, 1593-2023, 2922 1820-2297, 1820-2309,1820-2349, 1820-2352, 1981-2904, 2067-2319, 2087-2722, 2128-2840,2173-2605, 2211-2843, 2236-2904, 2238-2863, 2259-2521, 2259-2873,2259-2915, 2271-2838, 2492-2919, 2563-2746, 2587-2922 35/56008413CB1/1-470, 1-499, 1-533, 1-569, 10-501, 26-330, 43-652, 43-678, 43-679,43-756, 43-769, 43-779, 44-779, 51-779, 68-779 2763 322-779, 323-779,418-779, 423-600, 428-600, 457-1236, 544-600, 587-779, 653-1152,653-1161, 707-1313, 922-1618, 1014-1288, 1085-1444, 1094-1349,1094-1705, 1100-1692, 1125-1439, 1278-1836, 1311-1611, 1311-1735,1410-1483, 1459-2056, 1469-1657, 1471-1953, 1479-1758, 1502-2055,1516-2176, 1518-1988, 1533-1657, 1557-2261, 1562-1689, 1614-2220,1632-1924, 1675-1772, 1689-2242, 1689-2329, 1713-2332, 1722-1878,1723-2264, 1729-1858, 1739-2273, 1743-2413, 1748-2276, 1757-2381,1790-2381, 1794-2224, 1799-2273, 1807-2107, 1825-2381, 1853-2463,1857-2434, 1868-2346, 1879-2385, 1882-1986, 1886-2134, 
1886-2353,1886-2376, 1886-2380, 1886-2381, 1886-2389, 1886-2393, 1886-2396,1886-2404, 1886-2407, 1886-2414, 1886-2448, 1886-2455, 1886-2456,1886-2461, 1886-2464, 1886-2525, 1886-2538, 1886-2543, 1886-2555,1886-2582, 1886-2602, 1886-2666, 1888-2448, 1888-2526, 1889-2545,1893-2207, 1893-2477, 1897-2135, 1897-2528, 1900-2109, 1902-2088,1919-2477, 1923-2585, 1928-2598, 1953-2325, 1954-2040, 1955-2162,2019-2254, 2024-2526, 2038-2434, 2038-2443, 2038-2459, 2038-2496,2038-2585, 2046-2763, 2051-2556, 2086-2728, 2133-2604, 2144-2622,2151-2642, 2151-2651, 2159-2721, 2171-2760, 2172-2736, 2189-2438,2194-2464, 2194-2642, 2194-2662, 2194-2686, 2194-2729, 2194-2747,2194-2758, 2194-2760, 2194-2761, 2194-2762, 2207-2699, 2219-2701,2219-2763, 2229-2667, 2237-2763, 2243-2714, 2245-2763, 2256-2507,2260-2545, 2268-2546, 2280-2540, 2286-2763, 2291-2761, 2308-2738,2328-2762, 2340-2605, 2346-2746, 2352-2745, 2354-2486, 2357-2746,2367-2707, 2383-2592, 2392-2744, 2396-2763, 2400-2746, 2401-2594,2404-2529, 2407-2746, 2433-2746 36/6127911CB1/ 1-404, 1-442, 1-483,1-510, 1-554, 1-562, 1-566, 1-581, 1-582, 1-597, 1-602, 1-604, 1-621,1-627, 1-633, 1-634, 1-639, 5211 1-640, 2-423, 7-317, 22-640, 26-640,40-640, 44-503, 44-559, 53-582, 54-323, 88-640, 104-640, 138-640,177-640, 248-640, 277-971, 466-640, 483-744, 549-640, 581-640, 641-696,745-815, 745-972, 780-1353, 780-1363, 780-1380, 868-1300, 1035-1804,1099-1637, 1114-2221, 1115-1300, 1130-1839, 1130-1880, 1242-1712,1257-1865, 1273-1915, 1319-1972, 1356-1964, 1375-2003, 1391-1990,1412-2119, 1453-1982, 1453-1983, 1453-2035, 1453-2092, 1453-2101,1453-2130, 1467-2185, 1468-1863, 1479-1967, 1484-2118, 1501-2242,1501-2391, 1589-1982, 1589-2159, 1589-2186, 1591-2160, 1613-2051,1618-2037, 1618-2109, 1618-2118, 1618-2119, 1709-2485, 1710-2119,1739-2236, 1745-2375, 1745-2406, 1794-2486, 1796-2485, 1805-2485,1806-2485, 1840-2486, 1901-2702, 2263-2753, 2264-2553, 2338-2954,2338-2962, 2417-2909, 2417-2987, 2417-2993, 2417-2998, 2417-3003,2417-3029, 
2417-3036, 2417-3050, 2453-2591, 2453-2753, 2453-2868,2453-3050, 2459-2882, 2513-2920, 2531-2976, 2547-2784, 2592-3110,2602-2753, 2612-3212, 2618-3086, 2706-3502, 2754-3372, 2767-3266,2767-3294, 2767-3326, 2767-3430, 2768-3353, 2771-3284, 2773-3386,2774-3386, 2821-3307, 2821-3369, 2821-3386, 2822-3085, 2822-3386,2824-3501, 2836-3495, 2844-3490, 2855-3503, 2859-3503, 2864-3503,2870-3503, 2872-3503, 2874-3503, 2875-3331, 2875-3465, 2883-3070,2883-3221, 2883-3252, 2883-3277, 2883-3319, 2883-3348, 2883-3403,2883-3415, 2883-3424, 2883-3488, 2883-3503, 2886-3503, 2888-3503,2892-3503, 2893-3503, 2894-3503, 2900-3503, 2906-3503, 2924-3457,2924-3502, 2924-3503, 2926-3503, 2931-3503, 2948-3503, 2971-3503,2974-3467, 2974-3475, 2974-3502, 2974-3503, 2979-3503, 2983-3476,2983-3503, 2986-3503, 3000-3503, 3001-3503, 3025-3503, 3062-3503,3096-3503, 3214-3503, 3260-3502, 3260-3503, 3271-3493, 3271-3686,3271-3932, 3341-3616, 3341-3787, 3341-3821, 3341-3901, 3341-3943,3341-3954, 3341-3968, 3341-4001, 3341-4003, 3341-4009, 3410-3503,3417-4146, 3539-4192, 3550-4269, 3596-4177, 3680-4297, 3683-4298,3692-4186, 3695-4287, 3734-4536, 3765-4431, 3794-4495, 3816-4498,3826-4329, 3838-4164, 3844-4094, 3847-4496, 3853-4511, 3855-4369,3857-4430, 3860-4372, 3873-4548, 3876-4483, 3878-4427, 3896-4133,3900-4238, 3900-4466, 3904-4455, 3905-4461, 3913-4467, 3914-4544,4029-4763, 4104-4803, 4108-4793, 4109-4740, 4117-4794, 4125-4794,4133-4851, 4139-4792, 4153-4851, 4160-4775, 4166-4834, 4169-4819,4172-4812, 4179-4798, 4194-4867, 4210-4876, 4211-4805, 4216-4837,4216-4909, 4219-4726, 4220-4807, 4220-4929, 4243-4749, 4245-4742,4257-4806, 4258-4992, 4287-4900, 4287-5003, 4289-4926, 4292-4848,4294-4542, 4299-4963, 4310-5014, 4312-4793, 4314-4879, 4319-4851,4332-4883, 4355-4985, 4358-4978, 4361-4879, 4363-4976, 4365-4820,4366-4979, 4370-4989, 4371-5125, 4380-5000, 4386-4985, 4393-4954,4404-4797, 4405-5050, 4422-4873, 4427-5072, 4429-5090, 4430-5036,4432-4975, 4434-4982, 4436-5082, 4450-5062, 4453-4977, 
4457-4968,4459-4878, 4459-5037, 4468-5071, 4482-5049, 4486-5175, 4500-5015,4504-5204, 4512-5133, 4520-5094, 4522-5079, 4522-5087, 4523-4854,4530-5211, 4540-5184, 4543-4768, 4544-5135, 4550-5109, 4554-4886,4568-5043, 4579-4849, 4579-4978, 4579-5097, 4581-5131, 4582-4866,4617-5087, 4632-4942, 4632-5104, 4632-5211, 4651-4847, 4651-4864,4653-5138, 4658-4936, 4667-5092, 4668-5211, 4799-5211 37/6427133CB1/1-659, 1-716, 1-725, 1-741, 1-782, 22-518, 209-820, 275-532, 522-595,535-595, 672-899, 688-1061, 688-1184, 688-1240, 5701 688-1256, 747-1256,908-996, 996-1193, 1002-1236, 1002-1612, 1039-1256, 1112-1256,1153-1256, 1189-1612, 1196-1263, 1197-1446, 1447-1613, 1447-1917,1447-2036, 1910-2173, 1910-2594, 2193-2300, 2301-2856, 2631-2744,2669-2952, 2670-2856, 2670-2953, 2757-3430, 2802-2856, 2857-2959,2860-3419, 2938-3210, 2946-3493, 3097-3704, 3097-3763, 3349-3996,3520-3793, 3636-3884, 3707-3988, 3707-4166, 3867-4396, 3878-4150,4026-4615, 4071-4615, 4086-4615, 4087-4554, 4139-4665, 4214-4496,4290-4890, 4405-4766, 4476-4928, 4487-4772, 4499-4781, 4499-4788,4499-5052, 4518-4774, 4555-4936, 4635-4905, 4635-4910, 4642-4803,4693-4922, 4711-4992, 4778-5276, 4800-5384, 4855-5129, 4930-5158,4930-5393, 4939-5436, 4949-5222, 5000-5693, 5000-5701, 5083-5693,5102-5679, 5112-5701, 5118-5693, 5122-5674, 5122-5684, 5150-5452,5163-5664, 5229-5497, 5232-5701, 5233-5693, 5292-5680, 5369-5619,5503-5693, 5601-5694, 5624-5697 38/7472932CB1/ 1-935, 1-1122, 954-1122,965-1788, 967-1787, 1361-1485, 1505-1983, 1541-1987, 1549-1990,1570-1989, 1586-1985, 1990 1681-1988, 1788-1839, 1824-187539/8463147CB1/ 1-204, 159-209, 170-243, 170-752, 290-389, 499-657,674-1434, 675-1434, 767-1295, 769-1008, 769-1302, 773-1434, 3760800-1434, 1234-1764, 1234-1766, 1234-1772, 1303-1921, 1544-1725,1544-2003, 1544-2058, 1922-2139, 2085-2139, 2090-2139, 2090-2414,2242-2733, 2492-3052, 2656-3052, 2694-2971, 2694-3349, 2759-3349,3049-3591, 3055-3349, 3233-3760, 3256-3658 40/7506408CB1/ 1-280, 1-468,1-560, 1-630, 1-1150, 
101-200, 105-200, 110-200, 155-739, 178-520,240-798, 256-789, 263-801, 266-962, 1150 287-1072, 290-1072, 384-874,388-936, 388-938, 422-1007, 490-943, 490-1147, 502-1072, 583-1150,668-943
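The fragment coordinates listed above can be checked mechanically. The following is a minimal sketch, assuming each entry is a 1-based, inclusive start-end pair on the assembled polynucleotide; the function names `parse_fragments` and `covered_length` are hypothetical and are not part of the disclosure.

```python
def parse_fragments(text):
    """Parse comma-separated 'start-end' entries into (start, end) tuples."""
    fragments = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue
        start, end = token.split("-")
        fragments.append((int(start), int(end)))
    return fragments


def covered_length(fragments):
    """Count positions covered by at least one fragment (1-based, inclusive)."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(fragments):
        if current_end is None or start > current_end + 1:
            # Close the previous merged interval before starting a new one.
            if current_end is not None:
                total += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent fragment: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start + 1
    return total


if __name__ == "__main__":
    # A few of the fragments listed for 7506408CB1 (illustrative subset only).
    subset = parse_fragments("1-280, 1-468, 583-1150, 668-943")
    print(covered_length(subset))
```

Comparing the covered length against the listed sequence length (for example, 1150 for 7506408CB1) would indicate whether a chosen set of fragments spans the full polynucleotide.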

[0406] TABLE 5

Polynucleotide SEQ ID NO:   Incyte Project ID:   Representative Library
21                          6911460CB1           BRAXTDR15
22                          55138203CB1          THYMNOR02
23                          7478871CB1           KIDNNOT32
25                          7487851CB1           LUNGNOT37
26                          7472881CB1           LIVRTUE01
27                          7612560CB1           KIDCTME01
28                          2880370CB1           ISLTNOT01
29                          6267489CB1           KIDETXS02
30                          7484777CB1           BRADDIR01
31                          2493969CB1           BRAINOY02
32                          3244593CB1           BRAENOT02
33                          4921451CB1           PANCTUT01
34                          5547443CB1           TESTNOT11
35                          56008413CB1          LIVRTUE01
36                          6127911CB1           BRSTNOT01
37                          6427133CB1           TLYMNOT08
39                          8463147CB1           BRAIFET02
40                          7506408CB1           BONSTUT01

[0407] TABLE 6

Library: BONSTUT01
Vector: pINCY
Library Description: Library was constructed using RNA isolated from sacral bone tumor tissue removed from an 18-year-old Caucasian female during an exploratory laparotomy with soft tissue excision. Pathology indicated giant cell tumor of the sacrum. Patient history included a soft tissue malignant neoplasm. Family history included prostate cancer.

Library: BRADDIR01
Vector: pINCY
Library Description: Library was constructed using RNA isolated from diseased choroid plexus tissue of the lateral ventricle, removed from the brain of a 57-year-old Caucasian male, who died from a cerebrovascular accident.

Library: BRAENOT02
Vector: pINCY
Library Description: Library was constructed using RNA isolated from posterior parietal cortex tissue removed from the brain of a 35-year-old Caucasian male who died from cardiac failure.

Library: BRAIFET02
Vector: pINCY
Library Description: Library was constructed using RNA isolated from brain tissue removed from a Caucasian male fetus, who was stillborn with a hypoplastic left heart at 23 weeks' gestation.

Library: BRAINOY02
Vector: pINCY
Library Description: This large size-fractionated and normalized library was constructed using pooled cDNA generated using mRNA isolated from midbrain, inferior temporal cortex, medulla, and posterior parietal cortex tissues removed from a 35-year-old Caucasian male who died from cardiac failure. Pathology indicated moderate leptomeningeal fibrosis and multiple microinfarctions of the cerebral neocortex. Microscopically, the cerebral hemisphere revealed moderate fibrosis of the leptomeninges with focal calcifications. There was evidence of shrunken and slightly eosinophilic pyramidal neurons throughout the cerebral hemispheres. Scattered throughout the cerebral cortex, there were multiple small microscopic areas of cavitation with surrounding gliosis. Patient history included dilated cardiomyopathy, congestive heart failure, cardiomegaly and an enlarged spleen and liver. 0.28 million independent clones from this size-selected library were normalized in two rounds using conditions adapted from Soares et al., PNAS (1994) 91: 9228-9232 and Bonaldo et al., Genome Research 6 (1996): 791, except that a significantly longer (48 hours/round) reannealing hybridization was used.

Library: BRAXTDR15
Vector: PCDNA2.1
Library Description: This random primed library was constructed using RNA isolated from superior parietal neocortex tissue removed from a 55-year-old Caucasian female who died from cholangiocarcinoma. Pathology indicated mild meningeal fibrosis predominately over the convexities, scattered axonal spheroids in the white matter of the cingulate cortex and the thalamus, and a few scattered neurofibrillary tangles in the entorhinal cortex and the periaqueductal gray region. Pathology for the associated tumor tissue indicated well-differentiated cholangiocarcinoma of the liver with residual or relapsed tumor. Patient history included cholangiocarcinoma, post-operative Budd-Chiari syndrome, biliary ascites, hydrothorax, dehydration, malnutrition, oliguria and acute renal failure. Previous surgeries included cholecystectomy and resection of 85% of the liver.

Library: BRSTNOT01
Vector: PBLUESCRIPT
Library Description: Library was constructed using RNA isolated from the breast tissue of a 56-year-old Caucasian female who died in a motor vehicle accident.

Library: ISLTNOT01
Vector: pINCY
Library Description: Library was constructed using RNA isolated from a pooled collection of pancreatic islet cells.

Library: KIDCTME01
Vector: PCDNA2.1
Library Description: This 5′ biased random primed library was constructed using RNA isolated from kidney cortex tissue removed from a 65-year-old male during nephroureterectomy. Pathology indicated the margins of resection were free of involvement. Pathology for the matched tumor tissue indicated grade 3 renal cell carcinoma, clear cell type, forming a variegated multicystic mass situated within the mid-portion of the kidney. The tumor invaded deeply into but not through the renal capsule.

Library: KIDETXS02
Vector: pINCY
Library Description: This subtracted, transformed embryonal cell line library was constructed using 9 million clones from a treated, transformed embryonal cell line (293-EBNA) derived from kidney epithelial tissue and was subjected to two rounds of subtraction hybridization with 1.9 million clones from an untreated transformed embryonal cell line (293-EBNA) derived from a kidney epithelial tissue library. The starting library for subtraction was constructed using RNA isolated from the treated, transformed embryonal cell line (293-EBNA). The cells were treated with 5-aza-2′-deoxycytidine and transformed with adenovirus 5 DNA. The hybridization probe for subtraction was derived from a similarly constructed library from RNA isolated from untreated 293-EBNA cells from the same cell line. Subtractive hybridization conditions were based on the methodologies of Swaroop et al., NAR 19 (1991): 1954 and Bonaldo et al., Genome Research (1996) 6: 791.

Library: KIDNNOT32
Vector: pINCY
Library Description: Library was constructed using RNA isolated from kidney tissue removed from a 49-year-old Caucasian male who died from an intracranial hemorrhage and cerebrovascular accident. Patient history included tobacco abuse.

Library: LIVRTUE01
Vector: PCDNA2.1
Library Description: This 5′ biased random primed library was constructed using RNA isolated from liver tumor tissue removed from a 72-year-old Caucasian male during partial hepatectomy. Pathology indicated metastatic grade 2 (of 4) neuroendocrine carcinoma forming a mass. The patient presented with metastatic liver cancer. Patient history included benign hypertension, type I diabetes, prostatic hyperplasia, prostate cancer, alcohol abuse in remission, and tobacco abuse in remission. Previous surgeries included destruction of a pancreatic lesion, closed prostatic biopsy, transurethral prostatectomy, removal of bilateral testes and total splenectomy. Patient medications included Eulexin, Hytrin, Proscar, Ecotrin, and insulin. Family history included atherosclerotic coronary artery disease and acute myocardial infarction in the mother; atherosclerotic coronary artery disease and type II diabetes in the father.

Library: LUNGNOT37
Vector: pINCY
Library Description: Library was constructed using RNA isolated from lung tissue removed from a 15-year-old Caucasian female who died from a closed head injury. Serology was positive for cytomegalovirus.

Library: PANCTUT01
Vector: pINCY
Library Description: Library was constructed using RNA isolated from pancreatic tumor tissue removed from a 65-year-old Caucasian female during radical subtotal pancreatectomy. Pathology indicated an invasive grade 2 adenocarcinoma. Patient history included type II diabetes, osteoarthritis, cardiovascular disease, benign neoplasm in the large bowel, and a cataract. Previous surgeries included a total splenectomy, cholecystectomy, and abdominal hysterectomy. Family history included cardiovascular disease, type II diabetes, and stomach cancer.

Library: TESTNOT11
Vector: pINCY
Library Description: Library was constructed using RNA isolated from testicular tissue removed from a 16-year-old Caucasian male who died from hanging. Patient history included drug use (tobacco, marijuana, and cocaine use), and medications included Lithium, Ritalin, and Paxil.

Library: THYMNOR02
Vector: pINCY
Library Description: The library was constructed using RNA isolated from thymus tissue removed from a 2-year-old Caucasian female during a thymectomy and patch closure of left atrioventricular fistula. Pathology indicated there was no gross abnormality of the thymus. The patient presented with congenital heart abnormalities. Patient history included double inlet left ventricle and a rudimentary right ventricle, pulmonary hypertension, cyanosis, subaortic stenosis, seizures, and a fracture of the skull base. Family history included reflux neuropathy.

Library: TLYMNOT08
Vector: pINCY
Library Description: The library was constructed using RNA isolated from anergic allogenic T-lymphocyte tissue removed from an adult (40-50-year-old) Caucasian male. The cells were incubated for 3 days in the presence of 1 microgram/ml OKT3 mAb and 5% human serum.

[0408] TABLE 7

Program: ABI FACTURA
Description: A program that removes vector sequences and masks ambiguous bases in nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA.

Program: ABI/PARACEL FDF
Description: A Fast Data Finder useful in comparing and annotating amino acid or nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA; Paracel Inc., Pasadena, CA.
Parameter Threshold: Mismatch <50%

Program: ABI AutoAssembler
Description: A program that assembles nucleic acid sequences.
Reference: Applied Biosystems, Foster City, CA.

Program: BLAST
Description: A Basic Local Alignment Search Tool useful in sequence similarity search for amino acid and nucleic acid sequences. BLAST includes five functions: blastp, blastn, blastx, tblastn, and tblastx.
Reference: Altschul, S. F. et al. (1990) J. Mol. Biol. 215: 403-410; Altschul, S. F. et al. (1997) Nucleic Acids Res. 25: 3389-3402.
Parameter Threshold: ESTs: Probability value = 1.0E−8 or less. Full Length sequences: Probability value = 1.0E−10 or less.

Program: FASTA
Description: A Pearson and Lipman algorithm that searches for similarity between a query sequence and a group of sequences of the same type. FASTA comprises at least five functions: fasta, tfasta, fastx, tfastx, and ssearch.
Reference: Pearson, W. R. and D. J. Lipman (1988) Proc. Natl. Acad Sci. USA 85: 2444-2448; Pearson, W. R. (1990) Methods Enzymol. 183: 63-98; and Smith, T. F. and M. S. Waterman (1981) Adv. Appl. Math. 2: 482-489.
Parameter Threshold: ESTs: fasta E value = 1.06E−6. Assembled ESTs: fasta Identity = 95% or greater and Match length = 200 bases or greater; fastx E value = 1.0E−8 or less. Full Length sequences: fastx score = 100 or greater.

Program: BLIMPS
Description: A BLocks IMProved Searcher that matches a sequence against those in BLOCKS, PRINTS, DOMO, PRODOM, and PFAM databases to search for gene families, sequence homology, and structural fingerprint regions.
Reference: Henikoff, S. and J. G. Henikoff (1991) Nucleic Acids Res. 19: 6565-6572; Henikoff, J. G. and S. Henikoff (1996) Methods Enzymol. 266: 88-105; and Attwood, T. K. et al. (1997) J. Chem. Inf. Comput. Sci. 37: 417-424.
Parameter Threshold: Probability value = 1.0E−3 or less

Program: HMMER
Description: An algorithm for searching a query sequence against hidden Markov model (HMM)-based databases of protein family consensus sequences, such as PFAM and SMART.
Reference: Krogh, A. et al. (1994) J. Mol. Biol. 235: 1501-1531; Sonnhammer, E. L. L. et al. (1988) Nucleic Acids Res. 26: 320-322; Durbin, R. et al. (1998) Our World View, in a Nutshell, Cambridge Univ. Press, pp. 1-350.
Parameter Threshold: PFAM or SMART hits: Probability value = 1.0E−3 or less. Signal peptide hits: Score = 0 or greater.

Program: ProfileScan
Description: An algorithm that searches for structural and sequence motifs in protein sequences that match sequence patterns defined in Prosite.
Reference: Gribskov, M. et al. (1988) CABIOS 4: 61-66; Gribskov, M. et al. (1989) Methods Enzymol. 183: 146-159; Bairoch, A. et al. (1997) Nucleic Acids Res. 25: 217-221.
Parameter Threshold: Normalized quality score ≧ GCG-specified “HIGH” value for that particular Prosite motif. Generally, score = 1.4-2.1.

Program: Phred
Description: A base-calling algorithm that examines automated sequencer traces with high sensitivity and probability.
Reference: Ewing, B. et al. (1998) Genome Res. 8: 175-185; Ewing, B. and P. Green (1998) Genome Res. 8: 186-194.

Program: Phrap
Description: A Phils Revised Assembly Program including SWAT and CrossMatch, programs based on efficient implementation of the Smith-Waterman algorithm, useful in searching sequence homology and assembling DNA sequences.
Reference: Smith, T. F. and M. S. Waterman (1981) Adv. Appl. Math. 2: 482-489; Smith, T. F. and M. S. Waterman (1981) J. Mol. Biol. 147: 195-197; and Green, P., University of Washington, Seattle, WA.
Parameter Threshold: Score = 120 or greater; Match length = 56 or greater

Program: Consed
Description: A graphical tool for viewing and editing Phrap assemblies.
Reference: Gordon, D. et al. (1998) Genome Res. 8: 195-202.

Program: SPScan
Description: A weight matrix analysis program that scans protein sequences for the presence of secretory signal peptides.
Reference: Nielson, H. et al. (1997) Protein Engineering 10: 1-6; Claverie, J. M. and S. Audic (1997) CABIOS 12: 431-439.
Parameter Threshold: Score = 3.5 or greater

Program: TMAP
Description: A program that uses weight matrices to delineate transmembrane segments on protein sequences and determine orientation.
Reference: Persson, B. and P. Argos (1994) J. Mol. Biol. 237: 182-192; Persson, B. and P. Argos (1996) Protein Sci. 5: 363-371.

Program: TMHMMER
Description: A program that uses a hidden Markov model (HMM) to delineate transmembrane segments on protein sequences and determine orientation.
Reference: Sonnhammer, E. L. et al. (1998) Proc. Sixth Intl. Conf. on Intelligent Systems for Mol. Biol., Glasgow et al., eds., The Am. Assoc. for Artificial Intelligence Press, Menlo Park, CA, pp. 175-182.

Program: Motifs
Description: A program that searches amino acid sequences for patterns that matched those defined in Prosite.
Reference: Bairoch, A. et al. (1997) Nucleic Acids Res. 25: 217-221; Wisconsin Package Program Manual, version 9, page M51-59, Genetics Computer Group, Madison, WI.
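As an illustration of how the BLAST probability thresholds in Table 7 might be applied, the following is a minimal sketch. It assumes hits are already available as (subject, probability) records; the `Hit` class, the `filter_hits` function, and the query-type keys are hypothetical names introduced here for illustration and do not correspond to any BLAST interface.

```python
from dataclasses import dataclass

# Maximum probability values transcribed from the BLAST row of Table 7.
# The dictionary keys are illustrative labels, not terms used by BLAST itself.
BLAST_MAX_PROBABILITY = {
    "est": 1.0e-8,           # ESTs: probability value = 1.0E-8 or less
    "full_length": 1.0e-10,  # Full length sequences: 1.0E-10 or less
}


@dataclass
class Hit:
    subject: str        # identifier of the matched database sequence
    probability: float  # probability value reported for the match


def filter_hits(hits, query_type):
    """Keep only hits whose probability value meets the Table 7 threshold."""
    limit = BLAST_MAX_PROBABILITY[query_type]
    return [hit for hit in hits if hit.probability <= limit]


if __name__ == "__main__":
    # Made-up hits for demonstration only.
    hits = [Hit("a", 1.0e-12), Hit("b", 1.0e-9), Hit("c", 1.0e-3)]
    print([h.subject for h in filter_hits(hits, "est")])
    print([h.subject for h in filter_hits(hits, "full_length")])
```

The same pattern could be extended to the other rows of Table 7 (for example, the FASTA identity and match-length criteria) by adding further threshold entries.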

[0409]

1 40 1 617 PRT Homo sapiens misc_feature Incyte ID No 6911460CD1 1 MetVal Pro Val Glu Asn Thr Glu Gly Pro Ser Leu Leu Asn Gln 1 5 10 15 LysGly Thr Ala Val Glu Thr Glu Gly Ser Gly Ser Arg His Pro 20 25 30 Pro TrpAla Arg Gly Cys Gly Met Phe Thr Phe Leu Ser Ser Val 35 40 45 Thr Ala AlaVal Ser Gly Leu Leu Val Gly Tyr Glu Leu Gly Ile 50 55 60 Ile Ser Gly AlaLeu Leu Gln Ile Lys Thr Leu Leu Ala Leu Ser 65 70 75 Cys His Glu Gln GluMet Val Val Ser Ser Leu Val Ile Gly Ala 80 85 90 Leu Leu Ala Ser Leu ThrGly Gly Val Leu Ile Asp Arg Tyr Gly 95 100 105 Arg Arg Thr Ala Ile IleLeu Ser Ser Cys Leu Leu Gly Leu Gly 110 115 120 Ser Leu Val Leu Ile LeuSer Leu Ser Tyr Thr Val Leu Ile Val 125 130 135 Gly Arg Ile Ala Ile GlyVal Ser Ile Ser Leu Ser Ser Ile Ala 140 145 150 Thr Cys Val Tyr Ile AlaGlu Ile Ala Pro Gln His Arg Arg Gly 155 160 165 Leu Leu Val Ser Leu AsnGlu Leu Met Ile Val Ile Gly Ile Leu 170 175 180 Ser Ala Tyr Ile Ser AsnTyr Ala Phe Ala Asn Val Phe His Gly 185 190 195 Trp Lys Tyr Met Phe GlyLeu Val Ile Pro Leu Gly Val Leu Gln 200 205 210 Ala Ile Ala Met Tyr PheLeu Pro Pro Ser Pro Arg Phe Leu Val 215 220 225 Met Lys Gly Gln Glu GlyAla Ala Ser Lys Val Leu Gly Arg Leu 230 235 240 Arg Ala Leu Ser Asp ThrThr Glu Glu Leu Thr Val Ile Lys Ser 245 250 255 Ser Leu Lys Asp Glu TyrGln Tyr Ser Phe Trp Asp Leu Phe Arg 260 265 270 Ser Lys Asp Asn Met ArgThr Arg Ile Met Ile Gly Leu Thr Leu 275 280 285 Val Phe Phe Val Gln IleThr Gly Gln Pro Asn Ile Leu Phe Tyr 290 295 300 Ala Ser Thr Val Leu LysSer Val Gly Phe Gln Ser Asn Glu Ala 305 310 315 Ala Ser Leu Ala Ser ThrGly Val Gly Val Val Lys Val Ile Ser 320 325 330 Thr Ile Pro Ala Thr LeuLeu Val Asp His Val Gly Ser Lys Thr 335 340 345 Phe Leu Cys Ile Gly SerSer Val Met Ala Ala Ser Leu Val Thr 350 355 360 Met Gly Ile Val Asn LeuAsn Ile His Met Asn Phe Thr His Ile 365 370 375 Cys Arg Ser His Asn SerIle Asn Gln Ser Leu Asp Glu Ser Val 380 385 390 Ile Tyr Gly Pro Gly AsnLeu Ser Thr Asn Asn Asn Thr Leu Arg 395 400 405 Asp His Phe Lys Gly IleSer Ser 
His Ser Arg Ser Ser Leu Met 410 415 420 Pro Leu Arg Asn Asp ValAsp Lys Arg Gly Glu Thr Thr Ser Ala 425 430 435 Ser Leu Leu Asn Ala GlyLeu Ser His Thr Glu Tyr Gln Ile Val 440 445 450 Thr Asp Pro Gly Asp ValPro Ala Phe Leu Lys Trp Leu Ser Leu 455 460 465 Ala Ser Leu Leu Val TyrVal Ala Ala Phe Ser Ile Gly Leu Gly 470 475 480 Pro Met Pro Trp Leu ValLeu Ser Glu Ile Phe Pro Gly Gly Ile 485 490 495 Arg Gly Arg Ala Met AlaLeu Thr Ser Ser Met Asn Trp Gly Ile 500 505 510 Asn Leu Leu Ile Ser LeuThr Phe Leu Thr Val Thr Asp Leu Ile 515 520 525 Gly Leu Pro Trp Val CysPhe Ile Tyr Thr Ile Met Ser Leu Ala 530 535 540 Ser Leu Leu Phe Val ValMet Phe Ile Pro Glu Thr Lys Gly Cys 545 550 555 Ser Leu Glu Gln Ile SerMet Glu Leu Ala Lys Val Asn Tyr Val 560 565 570 Lys Asn Asn Ile Cys PheMet Ser His His Gln Glu Glu Leu Val 575 580 585 Pro Lys Gln Pro Gln LysArg Lys Pro Gln Glu Gln Leu Leu Glu 590 595 600 Cys Asn Lys Leu Cys GlyArg Gly Gln Ser Arg Gln Leu Ser Pro 605 610 615 Glu Thr 2 1193 PRT Homosapiens misc_feature Incyte ID No 55138203CD1 2 Met Tyr Ser Ala Asn IleGly Tyr Leu Leu Phe Val Gly Thr Gly 1 5 10 15 Val Glu Lys Met Asn AsnThr Pro Ser Met Ala Leu Gly Ser Ser 20 25 30 His Ser Gly Arg Gly Asn LeuThr Gln Ala Ala Thr Lys Pro Ser 35 40 45 Gly Tyr Glu Lys Thr Asp Asp ValSer Glu Lys Thr Ser Leu Ala 50 55 60 Asp Gln Glu Glu Val Arg Thr Ile PheIle Asn Gln Pro Gln Leu 65 70 75 Thr Lys Phe Cys Asn Asn His Val Ser ThrAla Lys Tyr Asn Ile 80 85 90 Ile Thr Phe Leu Pro Arg Phe Leu Tyr Ser GlnPhe Arg Arg Ala 95 100 105 Ala Asn Ser Phe Phe Leu Phe Ile Ala Leu LeuGln Gln Ile Pro 110 115 120 Asp Val Ser Pro Thr Gly Arg Tyr Thr Thr LeuVal Pro Leu Leu 125 130 135 Phe Ile Leu Ala Val Ala Ala Ile Lys Glu IleIle Glu Asp Ile 140 145 150 Lys Arg His Lys Ala Asp Asn Ala Val Asn LysLys Gln Thr Gln 155 160 165 Val Leu Arg Asn Gly Ala Trp Glu Ile Val HisTrp Glu Lys Val 170 175 180 Asn Val Gly Asp Ile Val Ile Ile Lys Gly LysGlu Tyr Ile Pro 185 190 195 Ala Asp Thr Val Leu Leu Ser Ser Ser Glu ProGln Ala Met Cys 
200 205 210 Tyr Ile Glu Thr Ser Asn Leu Asp Gly Glu ThrAsn Leu Lys Ile 215 220 225 Arg Gln Gly Leu Pro Ala Thr Ser Asp Ile LysAsp Val Asp Ser 230 235 240 Leu Met Arg Ile Ser Gly Arg Ile Glu Cys GluSer Pro Asn Arg 245 250 255 His Leu Tyr Asp Phe Val Gly Asn Ile Arg LeuAsp Gly His Gly 260 265 270 Thr Val Pro Leu Gly Ala Asp Gln Ile Leu LeuArg Gly Ala Gln 275 280 285 Leu Arg Asn Thr Gln Trp Val His Gly Ile ValVal Tyr Thr Gly 290 295 300 His Asp Thr Lys Leu Met Gln Asn Ser Thr SerPro Pro Leu Lys 305 310 315 Leu Ser Asn Val Glu Arg Ile Thr Asn Val GlnIle Leu Ile Leu 320 325 330 Phe Cys Ile Leu Ile Ala Met Ser Leu Val CysSer Val Gly Ser 335 340 345 Ala Ile Trp Asn Arg Arg His Ser Gly Lys AspTrp Tyr Leu Asn 350 355 360 Leu Asn Tyr Gly Gly Ala Ser Asn Phe Gly LeuAsn Phe Leu Thr 365 370 375 Phe Ile Ile Leu Phe Asn Asn Leu Ile Pro IleSer Leu Leu Val 380 385 390 Thr Leu Glu Val Val Lys Phe Thr Gln Ala TyrPhe Ile Asn Trp 395 400 405 Asp Leu Asp Met His Tyr Glu Pro Thr Asp ThrAla Ala Met Ala 410 415 420 Arg Thr Ser Asn Leu Asn Glu Glu Leu Gly GlnVal Lys Tyr Ile 425 430 435 Phe Ser Asp Lys Thr Gly Thr Leu Thr Cys AsnVal Met Gln Phe 440 445 450 Lys Lys Cys Thr Ile Ala Gly Val Ala Tyr GlyHis Val Pro Glu 455 460 465 Pro Glu Asp Tyr Gly Cys Ser Pro Asp Glu TrpGln Asn Ser Gln 470 475 480 Phe Gly Asp Glu Lys Thr Phe Ser Asp Ser SerLeu Leu Glu Asn 485 490 495 Leu Gln Asn Asn His Pro Thr Ala Pro Ile IleCys Glu Phe Leu 500 505 510 Thr Met Met Ala Val Cys His Thr Ala Val ProGlu Arg Glu Gly 515 520 525 Asp Lys Ile Ile Tyr Gln Ala Ala Ser Pro AspGlu Gly Ala Leu 530 535 540 Val Arg Ala Ala Lys Gln Leu Asn Phe Val PheThr Gly Arg Thr 545 550 555 Pro Asp Ser Val Ile Ile Asp Ser Leu Gly GlnGlu Glu Arg Tyr 560 565 570 Glu Leu Leu Asn Val Leu Glu Phe Thr Ser AlaArg Lys Arg Met 575 580 585 Ser Val Ile Val Arg Thr Pro Ser Gly Lys LeuArg Leu Tyr Cys 590 595 600 Lys Gly Ala Asp Thr Val Ile Tyr Asp Arg LeuAla Glu Thr Ser 605 610 615 Lys Tyr Lys Glu Ile Thr Leu Lys His Leu GluGln Phe Ala Thr 620 625 630 
Glu Gly Leu Arg Thr Leu Cys Phe Ala Val AlaGlu Ile Ser Glu 635 640 645 Ser Asp Phe Gln Glu Trp Arg Ala Val Tyr GlnArg Ala Ser Thr 650 655 660 Ser Val Gln Asn Arg Leu Leu Lys Leu Glu GluSer Tyr Glu Leu 665 670 675 Ile Glu Lys Asn Leu Gln Leu Leu Gly Ala ThrAla Ile Glu Asp 680 685 690 Lys Leu Gln Asp Gln Val Pro Glu Thr Ile GluThr Leu Met Lys 695 700 705 Ala Asp Ile Lys Ile Trp Ile Leu Thr Gly AspLys Gln Glu Thr 710 715 720 Ala Ile Asn Ile Gly His Ser Cys Lys Leu LeuLys Lys Asn Met 725 730 735 Gly Met Ile Val Ile Asn Glu Gly Ser Leu AspGly Thr Arg Glu 740 745 750 Thr Leu Ser Arg His Cys Thr Thr Leu Gly AspAla Leu Arg Lys 755 760 765 Glu Asn Asp Phe Ala Leu Ile Ile Asp Gly LysThr Leu Lys Tyr 770 775 780 Ala Leu Thr Phe Gly Val Arg Gln Tyr Phe LeuAsp Leu Ala Leu 785 790 795 Ser Cys Lys Ala Val Ile Cys Cys Arg Val SerPro Leu Gln Lys 800 805 810 Ser Glu Val Val Glu Met Val Lys Lys Gln ValLys Val Val Thr 815 820 825 Leu Ala Ile Gly Asp Gly Ala Asn Asp Val SerMet Ile Gln Thr 830 835 840 Ala His Val Gly Val Gly Ile Ser Gly Asn GluGly Leu Gln Ala 845 850 855 Ala Asn Ser Ser Asp Tyr Ser Ile Ala Gln PheLys Tyr Leu Lys 860 865 870 Asn Leu Leu Met Ile His Gly Ala Trp Asn TyrAsn Arg Val Ser 875 880 885 Lys Cys Ile Leu Tyr Cys Phe Tyr Lys Asn IleVal Leu Tyr Ile 890 895 900 Ile Glu Ile Trp Phe Ala Phe Val Asn Gly PheSer Gly Gln Ile 905 910 915 Leu Phe Glu Arg Trp Cys Ile Gly Leu Tyr AsnVal Met Phe Thr 920 925 930 Ala Met Pro Pro Leu Thr Leu Gly Ile Phe GluArg Ser Cys Arg 935 940 945 Lys Glu Asn Met Leu Lys Tyr Pro Glu Leu TyrLys Thr Ser Gln 950 955 960 Asn Ala Leu Asp Phe Asn Thr Lys Val Phe TrpVal His Cys Leu 965 970 975 Asn Gly Leu Phe His Ser Val Ile Leu Phe TrpPhe Pro Leu Lys 980 985 990 Ala Leu Gln Tyr Gly Thr Ala Phe Gly Asn GlyLys Thr Ser Asp 995 1000 1005 Tyr Leu Leu Leu Gly Asn Phe Val Tyr ThrPhe Val Val Ile Thr 1010 1015 1020 Val Cys Leu Lys Ala Gly Leu Glu ThrSer Tyr Trp Thr Trp Phe 1025 1030 1035 Ser His Ile Ala Ile Trp Gly SerIle Ala Leu Trp Val Val Phe 1040 1045 1050 
Leu Gly Ile Tyr Ser Ser LeuTrp Pro Ala Ile Pro Met Ala Pro 1055 1060 1065 Asp Met Ser Gly Glu AlaAla Met Leu Phe Ser Ser Gly Val Phe 1070 1075 1080 Trp Met Gly Leu LeuPhe Ile Pro Val Ala Ser Leu Leu Leu Asp 1085 1090 1095 Val Val Tyr LysVal Ile Lys Arg Thr Ala Phe Lys Thr Leu Val 1100 1105 1110 Asp Glu ValGln Glu Leu Glu Ala Lys Ser Gln Asp Pro Gly Ala 1115 1120 1125 Val ValLeu Gly Lys Ser Leu Thr Glu Arg Ala Gln Leu Leu Lys 1130 1135 1140 AsnVal Phe Lys Lys Asn His Val Asn Leu Tyr Arg Ser Glu Ser 1145 1150 1155Leu Gln Gln Asn Leu Leu His Gly Tyr Ala Phe Ser Gln Asp Glu 1160 11651170 Asn Gly Ile Val Ser Gln Ser Glu Val Ile Arg Ala Tyr Asp Thr 11751180 1185 Thr Lys Gln Arg Pro Asp Glu Trp 1190 3 989 PRT Homo sapiensmisc_feature Incyte ID No 7478871CD1 3 Met Gln Pro Ala Arg Gly Pro LeuAla Ser Glu Pro Arg Thr Val 1 5 10 15 Leu Val Leu Arg Phe Cys Ala SerLeu Met Glu Met Lys Leu Pro 20 25 30 Gly Gln Glu Gly Phe Glu Ala Ser SerAla Pro Arg Asn Ile Pro 35 40 45 Ser Gly Glu Leu Asp Ser Asn Pro Asp ProGly Thr Gly Pro Ser 50 55 60 Pro Asp Gly Pro Ser Asp Thr Glu Ser Lys GluLeu Gly Val Pro 65 70 75 Lys Asp Pro Leu Leu Phe Ile Gln Leu Asn Glu LeuLeu Gly Trp 80 85 90 Pro Gln Ala Leu Glu Trp Arg Glu Thr Gly Thr Trp ValLeu Phe 95 100 105 Glu Glu Lys Leu Glu Val Ala Ala Gly Arg Trp Ser AlaPro His 110 115 120 Val Pro Thr Leu Ala Leu Pro Ser Leu Gln Lys Leu ArgSer Leu 125 130 135 Leu Ala Glu Gly Leu Val Leu Leu Asp Cys Pro Ala GlnSer Leu 140 145 150 Leu Glu Leu Val Glu Gln Val Thr Arg Val Glu Ser LeuSer Pro 155 160 165 Glu Leu Arg Gly Gln Leu Gln Ala Leu Leu Leu Gln ArgPro Gln 170 175 180 His Tyr Asn Gln Thr Thr Gly Thr Arg Pro Cys Trp GlyGlu Ser 185 190 195 Pro Ser Leu Gly Pro Gly Pro Arg Pro Cys Thr Thr ArgPro Gln 200 205 210 Ala Pro Gly Pro Ala Gly Gln Cys Gln Asn Pro Leu ArgGln Lys 215 220 225 Leu Pro Pro Gly Ala Glu Ala Gly Thr Val Leu Ala GlyGlu Leu 230 235 240 Gly Phe Leu Ala Gln Pro Leu Gly Ala Phe Val Arg LeuArg Asn 245 250 255 Pro Val Val Leu Gly Ser Leu Thr Glu Val Ser Leu 
ProSer Arg 260 265 270 Phe Phe Cys Leu Leu Leu Gly Pro Cys Met Leu Gly LysGly Tyr 275 280 285 His Glu Met Gly Arg Ala Ala Ala Val Leu Leu Ser AspPro Gln 290 295 300 Phe Gln Trp Ser Val Arg Arg Ala Ser Asn Leu His AspLeu Leu 305 310 315 Ala Ala Leu Asp Ala Phe Leu Glu Glu Val Thr Val LeuPro Pro 320 325 330 Gly Arg Trp Asp Pro Thr Ala Arg Ile Pro Pro Pro LysCys Leu 335 340 345 Pro Ser Gln His Lys Arg Leu Pro Ser Gln Gln Arg GluIle Arg 350 355 360 Gly Pro Ala Val Pro Arg Leu Thr Ser Ala Glu Asp ArgHis Arg 365 370 375 His Gly Pro His Ala His Ser Pro Glu Leu Gln Arg ThrGly Arg 380 385 390 Leu Phe Gly Gly Leu Ile Gln Asp Val Arg Arg Lys ValPro Trp 395 400 405 Tyr Pro Ser Asp Phe Leu Asp Ala Leu His Leu Gln CysPhe Ser 410 415 420 Ala Val Leu Tyr Ile Tyr Leu Ala Thr Val Thr Asn AlaIle Thr 425 430 435 Phe Gly Gly Leu Leu Gly Asp Ala Thr Asp Gly Ala GlnGly Val 440 445 450 Leu Glu Ser Phe Leu Gly Thr Ala Val Ala Gly Ala AlaPhe Cys 455 460 465 Leu Met Ala Gly Gln Pro Leu Thr Ile Leu Ser Ser ThrGly Pro 470 475 480 Val Leu Val Phe Glu Arg Leu Leu Phe Ser Phe Ser ArgAsp Tyr 485 490 495 Ser Leu Asp Tyr Leu Pro Phe Arg Leu Trp Val Gly IleTrp Val 500 505 510 Ala Thr Phe Cys Leu Val Leu Val Ala Thr Glu Ala SerVal Leu 515 520 525 Val Arg Tyr Phe Thr Arg Phe Thr Glu Glu Gly Phe CysAla Leu 530 535 540 Ile Ser Leu Ile Phe Ile Tyr Asp Ala Val Gly Lys MetLeu Asn 545 550 555 Leu Thr His Thr Tyr Pro Ile Gln Lys Pro Gly Ser SerAla Tyr 560 565 570 Gly Cys Leu Cys Gln Tyr Pro Gly Pro Gly Gly Asn GluSer Gln 575 580 585 Trp Ile Arg Thr Arg Pro Lys Asp Arg Asp Asp Ile ValSer Met 590 595 600 Asp Leu Gly Leu Ile Asn Ala Ser Leu Leu Pro Pro ProGlu Cys 605 610 615 Thr Arg Gln Gly Gly His Pro Arg Gly Pro Gly Cys HisThr Val 620 625 630 Pro Asp Ile Ala Phe Phe Ser Leu Leu Leu Phe Leu ThrSer Phe 635 640 645 Phe Phe Ala Met Ala Leu Lys Cys Val Lys Thr Ser ArgPhe Phe 650 655 660 Pro Ser Val Val Arg Lys Gly Leu Ser Asp Phe Ser SerVal Leu 665 670 675 Ala Ile Leu Leu Gly Cys Gly Leu Asp Ala Phe Leu GlyLeu Ala 
680 685 690 Thr Pro Lys Leu Met Val Pro Arg Glu Phe Lys Pro ThrLeu Pro 695 700 705 Gly Arg Gly Trp Leu Val Ser Pro Phe Gly Ala Asn ProTrp Trp 710 715 720 Trp Ser Val Ala Ala Ala Leu Pro Ala Leu Leu Leu SerIle Leu 725 730 735 Ile Phe Met Asp Gln Gln Ile Thr Ala Val Ile Leu AsnArg Met 740 745 750 Glu Tyr Arg Leu Gln Lys Gly Ala Gly Phe His Leu AspLeu Phe 755 760 765 Cys Val Ala Val Leu Met Leu Leu Thr Ser Ala Leu GlyLeu Pro 770 775 780 Trp Tyr Val Ser Ala Thr Val Ile Ser Leu Ala His MetAsp Ser 785 790 795 Leu Arg Arg Glu Ser Arg Ala Cys Ala Pro Gly Glu ArgPro Asn 800 805 810 Phe Leu Gly Ile Arg Glu Gln Arg Leu Thr Gly Leu ValVal Phe 815 820 825 Ile Leu Thr Gly Ala Ser Ile Phe Leu Ala Pro Val LeuLys Phe 830 835 840 Ile Pro Met Pro Val Leu Tyr Gly Ile Phe Leu Tyr MetGly Val 845 850 855 Ala Ala Leu Ser Ser Ile Gln Phe Thr Asn Arg Val LysLeu Leu 860 865 870 Leu Met Pro Ala Lys His Gln Pro Asp Leu Leu Leu LeuArg His 875 880 885 Val Pro Leu Thr Arg Val His Leu Phe Thr Ala Ile GlnLeu Ala 890 895 900 Cys Leu Gly Leu Leu Trp Ile Ile Lys Ser Thr Pro AlaAla Ile 905 910 915 Ile Phe Pro Leu Met Leu Leu Gly Leu Val Gly Val ArgLys Ala 920 925 930 Leu Glu Arg Val Phe Ser Pro Gln Glu Leu Leu Trp LeuAsp Glu 935 940 945 Leu Met Pro Glu Glu Glu Arg Ser Ile Pro Glu Lys GlyLeu Glu 950 955 960 Pro Glu His Ser Phe Ser Gly Ser Asp Ser Glu Asp SerGlu Leu 965 970 975 Met Tyr Gln Pro Lys Ala Pro Glu Ile Asn Ile Ser ValAsn 980 985 4 505 PRT Homo sapiens misc_feature Incyte ID No 7483601CD14 Met Asp His Ala Glu Glu Asn Glu Ile Leu Ala Ala Thr Gln Arg 1 5 10 15Tyr Tyr Val Glu Arg Pro Ile Phe Ser His Pro Val Leu Gln Glu 20 25 30 ArgLeu His Thr Lys Asp Lys Val Pro Asp Ser Ile Ala Asp Lys 35 40 45 Leu LysGln Ala Phe Thr Cys Thr Pro Lys Lys Ile Arg Asn Ile 50 55 60 Ile Tyr MetPhe Leu Pro Ile Thr Lys Trp Leu Pro Ala Tyr Lys 65 70 75 Phe Lys Glu TyrVal Leu Gly Asp Leu Val Ser Gly Ile Ser Thr 80 85 90 Gly Val Leu Gln LeuPro Gln Gly Leu Ala Phe Ala Met Leu Ala 95 100 105 Ala Val Pro Pro IlePhe Gly Leu Tyr 
Pro Ser Phe Tyr Pro Val 110 115 120 Ile Met Tyr Cys PheLeu Gly Thr Ser Arg His Ile Ser Ile Gly 125 130 135 Pro Phe Ala Val IleSer Leu Met Ile Gly Gly Val Ala Val Arg 140 145 150 Leu Val Pro Asp AspIle Val Ile Pro Gly Gly Val Asn Ala Thr 155 160 165 Asn Gly Thr Glu AlaArg Asp Ala Leu Arg Val Lys Val Ala Met 170 175 180 Ser Val Thr Leu LeuSer Gly Ile Ile Gln Phe Cys Leu Gly Val 185 190 195 Cys Arg Phe Gly PheVal Ala Ile Tyr Leu Thr Glu Pro Leu Val 200 205 210 Arg Gly Phe Thr ThrAla Ala Ala Val His Val Phe Thr Ser Met 215 220 225 Leu Lys Tyr Leu PheGly Val Lys Thr Lys Arg Tyr Ser Gly Ile 230 235 240 Phe Ser Val Val TyrSer Thr Val Ala Val Leu Gln Asn Val Lys 245 250 255 Asn Leu Asn Val CysSer Leu Gly Val Gly Leu Met Val Phe Gly 260 265 270 Leu Leu Leu Gly GlyLys Glu Phe Asn Glu Arg Phe Lys Glu Lys 275 280 285 Leu Pro Ala Pro IlePro Leu Glu Phe Phe Ala Val Val Met Gly 290 295 300 Thr Gly Ile Ser AlaGly Phe Asn Leu Lys Glu Ser Tyr Asn Val 305 310 315 Asp Val Val Gly ThrLeu Pro Leu Gly Leu Leu Pro Pro Ala Asn 320 325 330 Pro Asp Thr Ser LeuPhe His Leu Val Tyr Val Asp Ala Ile Ala 335 340 345 Ile Ala Ile Val GlyPhe Ser Val Thr Ile Ser Met Ala Lys Thr 350 355 360 Leu Ala Asn Lys HisGly Tyr Gln Val Asp Gly Asn Gln Glu Leu 365 370 375 Ile Ala Leu Gly LeuCys Asn Ser Ile Gly Ser Leu Phe Gln Thr 380 385 390 Phe Ser Ile Ser CysSer Leu Ser Arg Ser Leu Val Gln Glu Gly 395 400 405 Thr Gly Gly Lys ThrGln Leu Ala Gly Cys Leu Ala Ser Leu Met 410 415 420 Ile Leu Leu Val IleLeu Ala Thr Gly Phe Leu Phe Glu Ser Leu 425 430 435 Pro Gln Ala Val LeuSer Ala Ile Val Ile Val Asn Leu Lys Gly 440 445 450 Met Phe Met Gln PheSer Asp Leu Pro Phe Phe Trp Arg Thr Ser 455 460 465 Lys Ile Glu Leu ThrIle Trp Leu Thr Thr Phe Val Ser Ser Leu 470 475 480 Phe Leu Gly Leu AspTyr Gly Leu Ile Thr Ala Val Ile Ile Ala 485 490 495 Leu Leu Thr Val IleTyr Arg Thr Gln Arg 500 505 5 618 PRT Homo sapiens misc_feature IncyteID No 7487851CD1 5 Met Ser Arg Ser Pro Leu Asn Pro Ser Gln Leu Arg SerVal Gly 1 5 10 15 Ser Gln Asp 
Ala Leu Ala Pro Leu Pro Pro Pro Ala ProGln Asn 20 25 30 Pro Ser Thr His Ser Trp Asp Pro Leu Cys Gly Ser Leu ProTrp 35 40 45 Gly Leu Ser Cys Leu Leu Ala Leu Gln His Val Leu Val Met Ala50 55 60 Ser Leu Leu Cys Val Ser His Leu Leu Leu Leu Cys Ser Leu Ser 6570 75 Pro Gly Gly Leu Ser Tyr Ser Pro Ser Gln Leu Leu Ala Ser Ser 80 8590 Phe Phe Ser Arg Gly Met Ser Thr Ile Leu Gln Thr Trp Met Gly 95 100105 Ser Arg Leu Pro Leu Val Gln Ala Pro Ser Leu Glu Phe Leu Ile 110 115120 Pro Ala Leu Val Leu Thr Ser Gln Lys Leu Pro Arg Ala Ile Gln 125 130135 Thr Pro Gly Asn Cys Glu His Arg Ala Arg Ala Arg Ala Ser Leu 140 145150 Met Leu His Leu Cys Arg Gly Pro Ser Cys His Gly Leu Gly His 155 160165 Trp Asn Thr Ser Leu Gln Glu Val Ser Gly Ala Val Val Val Ser 170 175180 Gly Leu Leu Gln Gly Met Met Gly Leu Leu Gly Ser Pro Gly His 185 190195 Val Phe Pro His Cys Gly Pro Leu Val Leu Ala Pro Ser Leu Val 200 205210 Val Ala Gly Leu Ser Ala His Arg Glu Val Ala Gln Phe Cys Phe 215 220225 Thr His Trp Gly Leu Ala Leu Leu Val Ile Leu Leu Met Val Val 230 235240 Cys Ser Gln His Leu Gly Ser Cys Gln Phe His Val Cys Pro Trp 245 250255 Arg Arg Ala Ser Thr Ser Ser Thr His Thr Pro Leu Pro Val Phe 260 265270 Arg Leu Leu Ser Val Leu Ile Pro Val Ala Cys Val Trp Ile Val 275 280285 Ser Ala Phe Val Gly Phe Ser Val Ile Pro Gln Glu Leu Ser Ala 290 295300 Pro Thr Lys Ala Pro Trp Ile Trp Leu Pro His Pro Gly Glu Trp 305 310315 Asn Trp Pro Leu Leu Thr Pro Arg Ala Leu Ala Ala Gly Ile Ser 320 325330 Met Ala Leu Ala Ala Ser Thr Ser Ser Leu Gly Cys Tyr Ala Leu 335 340345 Cys Gly Arg Leu Leu His Leu Pro Pro Pro Pro Pro His Ala Cys 350 355360 Ser Arg Gly Leu Ser Leu Glu Gly Leu Gly Ser Val Leu Ala Gly 365 370375 Leu Leu Gly Ser Pro Met Gly Thr Ala Ser Ser Phe Pro Asn Val 380 385390 Gly Lys Val Gly Leu Ile Gln Ala Gly Ser Gln Gln Val Ala His 395 400405 Leu Val Gly Leu Leu Cys Val Gly Leu Gly Leu Ser Pro Arg Leu 410 415420 Ala Gln Leu Leu Thr Thr Ile Pro Leu Pro Val Val Gly Gly Val 425 430435 Leu Gly Val Thr Gln Ala Val Val Leu Ser 
Ala Gly Phe Ser Ser 440 445 450 Phe Tyr Leu Ala Asp Ile Asp Ser Gly Arg Asn Ile Phe Ile Val 455 460 465 Gly Phe Ser Ile Phe Met Ala Leu Leu Leu Pro Arg Trp Phe Arg 470 475 480 Glu Ala Pro Val Leu Phe Ser Thr Gly Trp Ser Pro Leu Asp Val 485 490 495 Leu Leu His Ser Leu Leu Thr Gln Pro Ile Phe Leu Ala Gly Leu 500 505 510 Ser Gly Phe Leu Leu Glu Asn Thr Ile Pro Gly Thr Gln Leu Glu 515 520 525 Arg Gly Leu Gly Gln Gly Leu Pro Ser Pro Phe Thr Ala Gln Glu 530 535 540 Ala Arg Met Pro Gln Lys Pro Arg Glu Lys Ala Ala Gln Val Tyr 545 550 555 Arg Leu Pro Phe Pro Ile Gln Asn Leu Cys Pro Cys Ile Pro Gln 560 565 570 Pro Leu His Cys Leu Cys Pro Leu Pro Glu Asp Pro Gly Asp Glu 575 580 585 Glu Gly Gly Ser Ser Glu Pro Glu Glu Met Ala Asp Leu Leu Pro 590 595 600 Gly Ser Gly Glu Pro Cys Pro Glu Ser Ser Arg Glu Gly Phe Arg 605 610 615 Ser Gln Lys 6 377 PRT Homo sapiens misc_feature Incyte ID No 7472881CD1 6 Met Arg Ala Asn Cys Ser Ser Ser Ser Ala Cys Pro Ala Asn Ser 1 5 10 15 Ser Glu Glu Glu Leu Pro Val Gly Leu Glu Ala His Gly Asn Leu 20 25 30 Glu Leu Val Phe Thr Val Val Pro Thr Val Met Met Gly Leu Leu 35 40 45 Met Phe Ser Leu Gly Cys Ser Val Glu Ile Arg Lys Leu Trp Ser 50 55 60 His Ile Arg Arg Pro Trp Gly Ile Ala Val Gly Leu Leu Cys Gln 65 70 75 Phe Gly Leu Met Pro Phe Thr Ala Tyr Leu Leu Ala Ile Ser Phe 80 85 90 Ser Leu Lys Pro Val Gln Ala Ile Ala Val Leu Ile Met Gly Cys 95 100 105 Cys Pro Gly Gly Thr Ile Ser Asn Ile Phe Thr Phe Trp Val Asp 110 115 120 Gly Asp Met Asp Leu Ser Ile Ser Met Thr Thr Cys Ser Thr Val 125 130 135 Ala Ala Leu Gly Met Met Pro Leu Cys Ile Tyr Leu Tyr Thr Trp 140 145 150 Ser Trp Ser Leu Gln Gln Asn Leu Thr Ile Pro Tyr Gln Asn Ile 155 160 165 Gly Ile Thr Leu Val Cys Leu Thr Ile Pro Val Ala Phe Gly Val 170 175 180 Tyr Val Asn Tyr Arg Trp Pro Lys Gln Ser Lys Ile Ile Leu Lys 185 190 195 Ile Gly Ala Val Val Gly Gly Val Leu Leu Leu Val Val Ala Val 200 205 210 Ala Gly Val Val Leu Ala Lys Gly Ser Trp Asn Ser Asp Ile Thr 215 220 225 Leu Leu Thr Ile Ser Phe Ile Phe Pro Leu Ile Gly His Val Thr 230 235
240 Gly Phe Leu Leu Ala Leu Phe Thr His Gln Ser Trp Gln Arg Cys 245 250 255 Arg Thr Ile Ser Leu Glu Thr Gly Ala Gln Asn Ile Gln Met Cys 260 265 270 Ile Thr Met Leu Gln Leu Ser Phe Thr Ala Glu His Leu Val Gln 275 280 285 Met Leu Ser Phe Pro Leu Ala Tyr Gly Leu Phe Gln Leu Ile Asp 290 295 300 Gly Phe Leu Ile Val Ala Ala Tyr Gln Thr Tyr Lys Arg Arg Leu 305 310 315 Lys Asn Lys His Gly Lys Lys Asn Ser Gly Cys Thr Glu Val Cys 320 325 330 His Thr Arg Lys Ser Thr Ser Ser Arg Glu Thr Asn Ala Phe Leu 335 340 345 Glu Val Asn Glu Glu Gly Ala Ile Thr Pro Gly Pro Pro Gly Pro 350 355 360 Met Asp Cys His Arg Ala Leu Glu Pro Val Gly His Ile Thr Ser 365 370 375 Cys Glu 7 507 PRT Homo sapiens misc_feature Incyte ID No 7612560CD1 7 Met Ser Val Thr Lys Ser Thr Glu Gly Pro Gln Gly Ala Val Ala 1 5 10 15 Ile Lys Leu Asp Leu Met Ser Pro Pro Glu Ser Ala Lys Lys Leu 20 25 30 Glu Asn Lys Asp Ser Thr Phe Leu Asp Glu Ser Pro Ser Glu Ser 35 40 45 Ala Gly Leu Lys Lys Thr Lys Gly Ile Thr Val Phe Gln Ala Leu 50 55 60 Ile His Leu Val Lys Gly Asn Met Gly Thr Gly Ile Leu Gly Leu 65 70 75 Pro Leu Ala Val Lys Asn Ala Gly Ile Leu Met Gly Pro Leu Ser 80 85 90 Leu Leu Val Met Gly Phe Ile Ala Cys His Cys Met His Ile Leu 95 100 105 Val Lys Cys Ala Gln Arg Phe Cys Lys Arg Leu Asn Lys Pro Phe 110 115 120 Met Asp Tyr Gly Asp Thr Val Met His Gly Leu Glu Ala Asn Pro 125 130 135 Asn Ala Trp Leu Gln Asn His Ala His Trp Gly Arg His Ile Val 140 145 150 Ser Phe Phe Leu Ile Ile Thr Gln Leu Gly Phe Cys Cys Val Tyr 155 160 165 Ile Val Phe Leu Ala Asp Asn Leu Lys Gln Val Val Glu Ala Val 170 175 180 Asn Ser Thr Thr Asn Asn Cys Tyr Ser Asn Glu Thr Val Ile Leu 185 190 195 Thr Pro Thr Met Asp Ser Arg Leu Tyr Met Leu Ser Phe Leu Pro 200 205 210 Phe Leu Val Leu Leu Val Leu Ile Arg Asn Leu Arg Ile Leu Thr 215 220 225 Ile Phe Ser Met Leu Ala Asn Ile Ser Met Leu Val Ser Leu Val 230 235 240 Ile Ile Ile Gln Tyr Ile Thr Gln Glu Ile Pro Asp Pro Ser Arg 245 250 255 Leu Pro Leu Val Ala Ser Trp Lys Thr Tyr Pro Leu Phe Phe Gly 260 265 270 Thr Ala Ile Phe Ser Phe
Glu Ser Ile Gly Val Val Leu Pro Leu 275 280 285 Glu Asn Lys Met Lys Asn Ala Arg His Phe Pro Ala Ile Leu Ser 290 295 300 Leu Gly Met Ser Ile Val Thr Ser Leu Tyr Ile Gly Met Ala Ala 305 310 315 Leu Gly Tyr Leu Arg Phe Gly Asp Asp Ile Lys Ala Ser Ile Ser 320 325 330 Leu Asn Leu Pro Asn Cys Trp Leu Tyr Gln Ser Val Lys Leu Leu 335 340 345 Tyr Ile Ala Gly Ile Leu Cys Thr Tyr Ala Leu Gln Phe Tyr Val 350 355 360 Pro Ala Glu Ile Ile Ile Pro Phe Ala Ile Ser Arg Val Ser Thr 365 370 375 Arg Trp Ala Leu Pro Leu Asp Leu Ser Ile Arg Leu Val Met Val 380 385 390 Cys Leu Thr Cys Leu Leu Ala Ile Leu Ile Pro Arg Leu Asp Leu 395 400 405 Val Ile Ser Leu Val Gly Ser Val Ser Gly Thr Ala Leu Ala Leu 410 415 420 Ile Ile Pro Pro Leu Leu Glu Val Thr Thr Phe Tyr Ser Glu Gly 425 430 435 Met Ser Pro Leu Thr Ile Phe Lys Asp Val Leu Ile Ser Ile Leu 440 445 450 Gly Phe Val Gly Phe Val Val Gly Thr Tyr Gln Ala Leu Asp Glu 455 460 465 Leu Leu Lys Ser Glu Asp Ser His Pro Phe Ser Asn Ser Thr Thr 470 475 480 Phe Val Arg Val Glu Leu Cys Lys Lys Gln Pro Pro Glu Gly Pro 485 490 495 Lys Trp Gln Gln Leu Ala Lys Gly Asp Ala Ala Ser 500 505 8 438 PRT Homo sapiens misc_feature Incyte ID No 2880370CD1 8 Met Ile Arg Lys Leu Phe Ile Val Leu Leu Leu Leu Leu Val Thr 1 5 10 15 Ile Glu Glu Ala Arg Met Ser Ser Leu Ser Phe Leu Asn Ile Glu 20 25 30 Lys Thr Glu Ile Leu Phe Phe Thr Lys Thr Glu Glu Thr Ile Leu 35 40 45 Val Ser Ser Ser Tyr Glu Asn Lys Arg Pro Asn Ser Ser His Leu 50 55 60 Phe Val Lys Ile Glu Asp Pro Lys Ile Leu Gln Met Val Asn Val 65 70 75 Ala Lys Lys Ile Ser Ser Asp Ala Thr Asn Phe Thr Ile Asn Leu 80 85 90 Val Thr Asp Glu Glu Gly Glu Thr Asn Val Thr Ile Gln Leu Trp 95 100 105 Asp Ser Glu Gly Arg Gln Glu Arg Leu Ile Glu Glu Ile Lys Asn 110 115 120 Val Lys Val Lys Val Leu Lys Gln Lys Asp Ser Leu Leu Gln Ala 125 130 135 Pro Met His Ile Asp Arg Asn Ile Leu Met Leu Ile Leu Pro Leu 140 145 150 Ile Leu Leu Asn Lys Cys Ala Phe Gly Cys Lys Ile Glu Leu Gln 155 160 165 Leu Phe Gln Thr Val Trp Lys Arg Pro Leu Pro Val Ile Leu Gly 170 175 180 Ala Val
Thr Gln Phe Phe Leu Met Pro Phe Cys Gly Phe Leu Leu 185 190 195 Ser Gln Ile Val Ala Leu Pro Glu Ala Gln Ala Phe Gly Val Val 200 205 210 Met Thr Cys Thr Cys Pro Gly Gly Gly Gly Gly Tyr Leu Phe Ala 215 220 225 Leu Leu Leu Asp Gly Asp Phe Thr Leu Ala Ile Leu Met Thr Cys 230 235 240 Thr Ser Thr Leu Leu Ala Leu Ile Met Met Pro Val Asn Ser Tyr 245 250 255 Ile Tyr Ser Arg Ile Leu Gly Leu Ser Gly Thr Phe His Ile Pro 260 265 270 Val Ser Lys Ile Val Ser Thr Leu Leu Phe Ile Leu Val Pro Val 275 280 285 Ser Ile Gly Ile Val Ile Lys His Arg Ile Pro Glu Lys Ala Ser 290 295 300 Phe Leu Glu Arg Ile Ile Arg Pro Leu Ser Phe Ile Leu Met Phe 305 310 315 Val Gly Ile Tyr Leu Thr Phe Thr Val Gly Leu Val Phe Leu Lys 320 325 330 Thr Asp Asn Leu Glu Val Ile Leu Leu Gly Leu Leu Val Pro Ala 335 340 345 Leu Gly Leu Leu Phe Gly Tyr Ser Phe Ala Lys Val Cys Thr Leu 350 355 360 Pro Leu Pro Val Cys Lys Thr Val Ala Ile Glu Ser Gly Met Leu 365 370 375 Asn Ser Phe Leu Ala Leu Ala Val Ile Gln Leu Ser Phe Pro Gln 380 385 390 Ser Lys Ala Asn Leu Ala Ser Val Ala Pro Phe Thr Val Ala Met 395 400 405 Cys Ser Gly Cys Glu Met Leu Leu Ile Ile Leu Val Tyr Lys Ala 410 415 420 Lys Lys Arg Cys Ile Phe Phe Leu Gln Asp Lys Arg Lys Arg Asn 425 430 435 Phe Leu Ile 9 350 PRT Homo sapiens misc_feature Incyte ID No 6267489CD1 9 Met Leu Glu Gly Ala Glu Leu Tyr Phe Asn Val Asp His Gly Tyr 1 5 10 15 Leu Glu Gly Leu Val Arg Gly Cys Lys Ala Ser Leu Leu Thr Gln 20 25 30 Gln Asp Tyr Ile Asn Leu Val Gln Cys Glu Thr Leu Glu Asp Leu 35 40 45 Lys Ile His Leu Gln Thr Thr Asp Tyr Gly Asn Phe Leu Ala Asn 50 55 60 His Thr Asn Pro Leu Thr Val Ser Lys Ile Asp Thr Glu Met Arg 65 70 75 Lys Arg Leu Cys Gly Glu Phe Glu Tyr Phe Arg Asn His Ser Leu 80 85 90 Glu Pro Leu Ser Thr Phe Leu Thr Tyr Met Thr Cys Ser Tyr Met 95 100 105 Ile Asp Asn Val Ile Leu Leu Met Asn Gly Ala Leu Gln Lys Lys 110 115 120 Ser Val Lys Glu Ile Leu Gly Lys Cys His Pro Leu Gly Arg Phe 125 130 135 Thr Glu Met Glu Ala Val Asn Ile Ala Glu Thr Pro Ser Asp Leu 140 145 150 Phe Asn Ala Ile Leu Ile Glu Thr Pro
Leu Ala Pro Phe Phe Gln 155 160 165 Asp Cys Met Ser Glu Asn Ala Leu Asp Glu Leu Asn Ile Glu Leu 170 175 180 Leu Arg Asn Lys Leu Tyr Lys Ser Tyr Leu Glu Ala Phe Tyr Lys 185 190 195 Phe Cys Lys Asn His Gly Asp Val Thr Ala Glu Val Met Cys Pro 200 205 210 Ile Leu Glu Phe Glu Ala Asp Arg Arg Ala Phe Ile Ile Thr Leu 215 220 225 Asn Ser Phe Gly Thr Glu Leu Ser Lys Glu Asp Arg Glu Thr Leu 230 235 240 Tyr Pro Thr Phe Gly Lys Leu Tyr Pro Glu Gly Leu Arg Leu Leu 245 250 255 Ala Gln Ala Glu Asp Phe Asp Gln Met Lys Asn Val Ala Asp His 260 265 270 Tyr Gly Val Tyr Lys Pro Leu Phe Glu Ala Val Gly Gly Ser Gly 275 280 285 Gly Lys Thr Leu Glu Asp Val Phe Tyr Glu Arg Glu Val Gln Met 290 295 300 Asn Val Leu Ala Phe Asn Arg Gln Phe His Tyr Gly Val Phe Tyr 305 310 315 Ala Tyr Val Lys Leu Lys Glu Gln Glu Ile Arg Asn Ile Val Trp 320 325 330 Ile Ala Glu Cys Ile Ser Gln Arg His Arg Thr Lys Ile Asn Ser 335 340 345 Tyr Ile Pro Ile Leu 350 10 1707 PRT Homo sapiens misc_feature Incyte ID No 7484777CD1 10 Met Pro Glu Pro Trp Gly Thr Val Tyr Phe Leu Gly Ile Ala Gln 1 5 10 15 Val Phe Ser Phe Leu Phe Ser Trp Trp Asn Leu Glu Gly Val Met 20 25 30 Asn Gln Ala Asp Ala Pro Arg Pro Leu Asn Trp Thr Ile Arg Lys 35 40 45 Leu Cys His Ala Ala Phe Leu Pro Ser Val Arg Leu Leu Lys Ala 50 55 60 Gln Lys Ser Trp Ile Glu Arg Ala Phe Tyr Lys Arg Glu Cys Val 65 70 75 His Ile Ile Pro Ser Thr Lys Asp Pro His Arg Cys Cys Cys Gly 80 85 90 Arg Leu Ile Gly Gln His Val Gly Leu Thr Pro Ser Ile Ser Val 95 100 105 Leu Gln Asn Glu Lys Asn Glu Ser Arg Leu Ser Arg Asn Asp Ile 110 115 120 Gln Ser Glu Lys Trp Ser Ile Ser Lys His Thr Gln Leu Ser Pro 125 130 135 Thr Asp Ala Phe Gly Thr Ile Glu Phe Gln Gly Gly Gly His Ser 140 145 150 Asn Lys Ala Met Tyr Val Arg Val Ser Phe Asp Thr Lys Pro Asp 155 160 165 Leu Leu Leu His Leu Met Thr Lys Glu Trp Gln Leu Glu Leu Pro 170 175 180 Lys Leu Leu Ile Ser Val His Gly Gly Leu Gln Asn Phe Glu Leu 185 190 195 Gln Pro Lys Leu Lys Gln Val Phe Gly Lys Gly Leu Ile Lys Ala 200 205 210 Ala Met Thr Thr Gly Ala Trp Ile Phe Thr Gly Gly
Val Asn Thr 215 220 225 Gly Val Ile Arg His Val Gly Asp Ala Leu Lys Asp His Ala Ser 230 235 240 Lys Ser Arg Gly Lys Ile Cys Thr Ile Gly Ile Ala Pro Trp Gly 245 250 255 Ile Val Glu Asn Gln Glu Asp Leu Ile Gly Arg Asp Val Val Arg 260 265 270 Pro Tyr Gln Thr Met Ser Asn Pro Met Ser Lys Leu Thr Val Leu 275 280 285 Asn Ser Met His Ser His Phe Ile Leu Ala Asp Asn Gly Thr Thr 290 295 300 Gly Lys Tyr Gly Ala Glu Val Lys Leu Arg Arg Gln Leu Glu Lys 305 310 315 His Ile Ser Leu Gln Lys Ile Asn Thr Arg Ile Gly Gln Gly Val 320 325 330 Pro Val Val Ala Leu Ile Val Glu Gly Gly Pro Asn Val Ile Ser 335 340 345 Ile Val Leu Glu Tyr Leu Arg Asp Thr Pro Pro Val Pro Val Val 350 355 360 Val Cys Asp Gly Ser Gly Arg Ala Ser Asp Ile Leu Ala Phe Gly 365 370 375 His Lys Tyr Ser Glu Glu Gly Gly Leu Ile Asn Glu Ser Leu Arg 380 385 390 Asp Gln Leu Leu Val Thr Ile Gln Lys Thr Phe Thr Tyr Thr Arg 395 400 405 Thr Gln Ala Gln His Leu Phe Ile Ile Leu Met Glu Cys Met Lys 410 415 420 Lys Lys Glu Leu Ile Thr Val Phe Arg Met Gly Ser Glu Gly His 425 430 435 Gln Asp Ile Asp Leu Ala Ile Leu Thr Ala Leu Leu Lys Gly Ala 440 445 450 Asn Ala Ser Ala Pro Asp Gln Leu Ser Leu Ala Leu Ala Trp Asn 455 460 465 Arg Val Asp Ile Ala Arg Ser Gln Ile Phe Ile Tyr Gly Gln Gln 470 475 480 Trp Pro Val Gly Ser Leu Glu Gln Ala Met Leu Asp Ala Leu Val 485 490 495 Leu Asp Arg Val Asp Phe Val Lys Leu Leu Ile Glu Asn Gly Val 500 505 510 Ser Met His Arg Phe Leu Thr Ile Ser Arg Leu Glu Glu Leu Tyr 515 520 525 Asn Thr Arg His Gly Pro Ser Asn Thr Leu Tyr His Leu Val Arg 530 535 540 Asp Val Lys Lys Gly Asn Leu Pro Pro Asp Tyr Arg Ile Ser Leu 545 550 555 Ile Asp Ile Gly Leu Val Ile Glu Tyr Leu Met Gly Gly Ala Tyr 560 565 570 Arg Cys Asn Tyr Thr Arg Lys Arg Phe Arg Thr Leu Tyr His Asn 575 580 585 Leu Phe Gly Pro Lys Arg Pro Lys Ala Leu Lys Leu Leu Gly Met 590 595 600 Glu Asp Asp Ile Pro Leu Arg Arg Gly Arg Lys Thr Thr Lys Lys 605 610 615 Arg Glu Glu Glu Val Asp Ile Asp Leu Asp Asp Pro Glu Ile Asn 620 625 630 His Phe Pro Phe Pro Phe His Glu Leu Met Val Trp Ala Val Leu
635 640 645 Met Lys Arg Gln Lys Met Ala Leu Phe Phe Trp Gln His Gly Glu 650 655 660 Glu Ala Met Ala Lys Ala Leu Val Ala Cys Lys Leu Cys Lys Ala 665 670 675 Met Ala His Glu Ala Ser Glu Asn Asp Met Val Asp Asp Ile Ser 680 685 690 Gln Glu Leu Asn His Asn Ser Arg Asp Phe Gly Gln Leu Ala Val 695 700 705 Glu Leu Leu Asp Gln Ser Tyr Lys Gln Asp Glu Gln Leu Ala Met 710 715 720 Lys Leu Leu Thr Tyr Glu Leu Lys Asn Trp Ser Asn Ala Thr Cys 725 730 735 Leu Gln Leu Ala Val Ala Ala Lys His Arg Asp Phe Ile Ala His 740 745 750 Thr Cys Ser Gln Met Leu Leu Thr Asp Met Trp Met Gly Arg Leu 755 760 765 Arg Met Arg Lys Asn Ser Gly Leu Lys Val Ile Leu Gly Ile Leu 770 775 780 Leu Pro Pro Ser Ile Leu Ser Leu Glu Phe Lys Asn Lys Asp Asp 785 790 795 Met Pro Tyr Met Ser Gln Ala Gln Glu Ile His Leu Gln Glu Lys 800 805 810 Glu Ala Glu Glu Pro Glu Lys Pro Thr Lys Glu Lys Glu Glu Glu 815 820 825 Asp Met Glu Leu Thr Ala Met Leu Gly Arg Asn Asn Gly Glu Ser 830 835 840 Ser Arg Lys Lys Asp Glu Glu Glu Val Gln Ser Lys His Arg Leu 845 850 855 Ile Pro Leu Gly Arg Lys Ile Tyr Glu Phe Tyr Asn Ala Pro Ile 860 865 870 Val Lys Phe Trp Phe Tyr Thr Leu Ala Tyr Ile Gly Tyr Leu Met 875 880 885 Leu Phe Asn Tyr Ile Val Leu Val Lys Met Glu Arg Trp Pro Ser 890 895 900 Thr Gln Glu Trp Ile Val Ile Ser Tyr Ile Phe Thr Leu Gly Ile 905 910 915 Glu Lys Met Arg Glu Ile Leu Met Ser Glu Pro Gly Lys Leu Leu 920 925 930 Gln Lys Val Lys Val Trp Leu Gln Glu Tyr Trp Asn Val Thr Asp 935 940 945 Leu Ile Ala Ile Leu Leu Phe Ser Val Gly Met Ile Leu Arg Leu 950 955 960 Gln Asp Gln Pro Phe Arg Ser Asp Gly Arg Val Ile Tyr Cys Val 965 970 975 Asn Ile Ile Tyr Trp Tyr Ile Arg Leu Leu Asp Ile Phe Gly Val 980 985 990 Asn Lys Tyr Leu Gly Pro Tyr Val Met Met Ile Gly Lys Met Met 995 1000 1005 Ile Asp Met Met Tyr Phe Val Ile Ile Met Leu Val Val Leu Met 1010 1015 1020 Ser Phe Gly Val Ala Arg Gln Ala Ile Leu Phe Pro Asn Glu Glu 1025 1030 1035 Pro Ser Trp Lys Leu Ala Lys Asn Ile Phe Tyr Met Pro Tyr Trp 1040 1045 1050 Met Ile Tyr Gly Glu Val Phe Ala Asp Gln Ile Asp Pro Pro Cys
1055 1060 1065 Gly Gln Asn Glu Thr Arg Glu Asp Gly Lys Ile Ile Gln Leu Pro 1070 1075 1080 Pro Cys Lys Thr Gly Ala Trp Ile Val Pro Ala Ile Met Ala Cys 1085 1090 1095 Tyr Leu Leu Val Ala Asn Ile Leu Leu Val Asn Leu Leu Ile Ala 1100 1105 1110 Val Phe Asn Asn Thr Phe Phe Glu Val Lys Ser Ile Ser Asn Gln 1115 1120 1125 Val Trp Lys Phe Gln Arg Tyr Gln Leu Ile Met Thr Phe His Glu 1130 1135 1140 Arg Pro Val Leu Pro Pro Pro Leu Ile Ile Phe Ser His Met Thr 1145 1150 1155 Met Ile Phe Gln His Leu Cys Cys Arg Trp Arg Lys His Glu Ser 1160 1165 1170 Asp Pro Asp Glu Arg Asp Tyr Gly Leu Lys Leu Phe Ile Thr Asp 1175 1180 1185 Asp Glu Leu Lys Lys Val His Asp Phe Glu Glu Gln Cys Ile Glu 1190 1195 1200 Glu Tyr Phe Arg Glu Lys Asp Asp Arg Phe Asn Ser Ser Asn Asp 1205 1210 1215 Glu Arg Ile Arg Val Thr Ser Glu Arg Val Glu Asn Met Ser Met 1220 1225 1230 Arg Leu Glu Glu Val Asn Glu Arg Glu His Ser Met Lys Ala Ser 1235 1240 1245 Leu Gln Thr Val Asp Ile Arg Leu Ala Gln Leu Glu Asp Leu Ile 1250 1255 1260 Gly Arg Met Ala Thr Ala Leu Glu Arg Leu Thr Gly Leu Glu Arg 1265 1270 1275 Ala Glu Ser Asn Lys Ile Arg Ser Arg Thr Ser Ser Asp Cys Thr 1280 1285 1290 Asp Ala Ala Tyr Ile Val Arg Gln Ser Ser Phe Asn Ser Gln Glu 1295 1300 1305 Gly Asn Thr Phe Lys Leu Gln Glu Ser Ile Asp Pro Ala Gly Glu 1310 1315 1320 Glu Thr Met Ser Pro Thr Ser Pro Thr Leu Met Pro Arg Met Arg 1325 1330 1335 Ser His Ser Phe Tyr Ser Val Asn Met Lys Asp Lys Gly Gly Ile 1340 1345 1350 Glu Lys Leu Glu Ser Ile Phe Lys Glu Arg Ser Leu Ser Leu His 1355 1360 1365 Arg Ala Thr Ser Ser His Ser Val Ala Lys Glu Pro Lys Ala Pro 1370 1375 1380 Ala Ala Pro Ala Asn Thr Leu Ala Ile Val Pro Asp Ser Arg Arg 1385 1390 1395 Pro Ser Ser Cys Ile Asp Ile Tyr Val Ser Ala Met Asp Glu Leu 1400 1405 1410 His Cys Asp Ile Asp Pro Leu Asp Asn Ser Val Asn Ile Leu Gly 1415 1420 1425 Leu Gly Glu Pro Ser Phe Ser Thr Pro Val Pro Ser Thr Ala Pro 1430 1435 1440 Ser Ser Ser Ala Tyr Ala Thr Leu Ala Pro Thr Asp Arg Pro Pro 1445 1450 1455 Ser Arg Ser Ile Asp Phe Glu Asp Ile Thr Ser Met Asp Thr Arg
1460 1465 1470 Ser Phe Ser Ser Asp Tyr Thr His Leu Pro Glu Cys Gln Asn Pro 1475 1480 1485 Trp Asp Ser Glu Pro Pro Met Tyr His Thr Ile Glu Arg Ser Lys 1490 1495 1500 Ser Ser Arg Tyr Leu Ala Thr Thr Pro Phe Leu Leu Glu Glu Ala 1505 1510 1515 Pro Ile Val Lys Ser His Ser Phe Met Phe Ser Pro Ser Arg Ser 1520 1525 1530 Tyr Tyr Ala Asn Phe Gly Val Pro Val Lys Thr Ala Glu Tyr Thr 1535 1540 1545 Ser Ile Thr Asp Cys Ile Asp Thr Arg Cys Val Asn Ala Pro Gln 1550 1555 1560 Ala Ile Ala Asp Arg Ala Ala Phe Pro Gly Gly Leu Gly Asp Lys 1565 1570 1575 Val Glu Asp Leu Thr Cys Cys His Pro Glu Arg Glu Ala Glu Leu 1580 1585 1590 Ser His Pro Ser Ser Asp Ser Glu Glu Asn Glu Ala Lys Gly Arg 1595 1600 1605 Arg Ala Thr Ile Ala Ile Ser Ser Gln Glu Gly Asp Asn Ser Glu 1610 1615 1620 Arg Thr Leu Ser Asn Asn Ile Thr Val Pro Lys Ile Glu Arg Ala 1625 1630 1635 Asn Ser Tyr Ser Ala Glu Glu Pro Ser Ala Pro Tyr Ala His Thr 1640 1645 1650 Arg Lys Ser Phe Ser Ile Ser Asp Lys Leu Asp Arg Gln Arg Asn 1655 1660 1665 Thr Ala Ser Leu Arg Asn Pro Phe Gln Arg Ser Lys Ser Ser Lys 1670 1675 1680 Pro Glu Gly Arg Gly Asp Ser Leu Ser Met Arg Lys Leu Ser Arg 1685 1690 1695 Thr Ser Ala Phe Gln Ser Phe Glu Ser Lys His Thr 1700 1705 11 771 PRT Homo sapiens misc_feature Incyte ID No 2493969CD1 11 Met Ser Gly Phe Phe Thr Ser Leu Asp Pro Arg Arg Val Gln Trp 1 5 10 15 Gly Ala Ala Trp Tyr Ala Met His Ser Arg Ile Leu Arg Thr Lys 20 25 30 Pro Val Glu Ser Met Leu Glu Gly Thr Gly Thr Thr Thr Ala His 35 40 45 Gly Thr Lys Leu Ala Gln Val Leu Thr Thr Val Asp Leu Ile Ser 50 55 60 Leu Gly Val Gly Ser Cys Val Gly Thr Gly Met Tyr Val Val Ser 65 70 75 Gly Leu Val Ala Lys Glu Met Ala Gly Pro Gly Val Ile Val Ser 80 85 90 Phe Ile Ile Ala Ala Val Ala Ser Ile Leu Ser Gly Val Cys Tyr 95 100 105 Ala Glu Phe Gly Val Arg Val Pro Lys Thr Thr Gly Ser Ala Tyr 110 115 120 Thr Tyr Ser Tyr Val Thr Val Gly Glu Phe Val Ala Phe Phe Ile 125 130 135 Gly Trp Asn Leu Ile Leu Glu Tyr Leu Ile Gly Thr Ala Ala Gly 140 145 150 Ala Ser Ala Leu Ser Ser Met Phe Asp Ser Leu Ala Asn His Thr 155
160 165 Ile Ser Arg Trp Met Ala Asp Ser Val Gly Thr Leu Asn Gly Leu 170 175 180 Gly Lys Gly Glu Glu Ser Tyr Pro Asp Leu Leu Ala Leu Leu Ile 185 190 195 Ala Val Ile Val Thr Ile Ile Val Ala Leu Gly Val Lys Asn Ser 200 205 210 Ile Gly Phe Asn Asn Val Leu Asn Val Leu Asn Leu Ala Val Trp 215 220 225 Val Phe Ile Met Ile Ala Gly Leu Phe Phe Ile Asn Gly Lys Tyr 230 235 240 Trp Ala Glu Gly Gln Phe Leu Pro His Gly Trp Ser Gly Val Leu 245 250 255 Gln Gly Ala Ala Thr Cys Phe Tyr Ala Phe Ile Gly Phe Asp Ile 260 265 270 Ile Ala Thr Thr Gly Glu Glu Ala Lys Asn Pro Asn Thr Ser Ile 275 280 285 Pro Tyr Ala Ile Thr Ala Ser Leu Val Ile Cys Leu Thr Ala Tyr 290 295 300 Val Ser Val Ser Val Ile Leu Thr Leu Met Val Pro Tyr Tyr Thr 305 310 315 Ile Asp Thr Glu Ser Pro Leu Met Glu Met Phe Val Ala His Gly 320 325 330 Phe Tyr Ala Ala Lys Phe Val Val Ala Ile Gly Ser Val Ala Gly 335 340 345 Leu Thr Val Ser Leu Leu Gly Ser Leu Phe Pro Met Pro Arg Val 350 355 360 Ile Tyr Ala Met Ala Gly Asp Gly Leu Leu Phe Arg Phe Leu Ala 365 370 375 His Val Ser Ser Tyr Thr Glu Thr Pro Val Val Ala Cys Ile Val 380 385 390 Ser Gly Phe Leu Ala Ala Leu Leu Ala Leu Leu Val Ser Leu Arg 395 400 405 Asp Leu Ile Glu Met Met Ser Ile Gly Thr Leu Leu Ala Tyr Thr 410 415 420 Leu Val Ser Val Cys Val Leu Leu Leu Arg Tyr Gln Pro Glu Ser 425 430 435 Asp Ile Asp Gly Phe Val Lys Phe Leu Ser Glu Glu His Thr Lys 440 445 450 Lys Lys Glu Gly Ile Leu Ala Asp Cys Glu Lys Glu Ala Cys Ser 455 460 465 Pro Val Ser Glu Gly Asp Glu Phe Ser Gly Pro Ala Thr Asn Thr 470 475 480 Cys Gly Ala Lys Asn Leu Pro Ser Leu Gly Asp Asn Glu Met Leu 485 490 495 Ile Gly Lys Ser Asp Lys Ser Thr Tyr Asn Val Asn His Pro Asn 500 505 510 Tyr Gly Thr Val Asp Met Thr Thr Gly Ile Glu Ala Asp Glu Ser 515 520 525 Glu Asn Ile Tyr Leu Ile Lys Leu Lys Lys Leu Ile Gly Pro His 530 535 540 Tyr Tyr Thr Met Arg Ile Arg Leu Gly Leu Pro Gly Lys Met Asp 545 550 555 Arg Pro Thr Ala Ala Thr Gly His Thr Val Thr Ile Cys Val Leu 560 565 570 Leu Leu Phe Ile Leu Met Phe Ile Phe Cys Ser Phe Ile Ile Phe 575 580 585 Gly
Ser Asp Tyr Ile Ser Glu Gln Ser Trp Trp Ala Ile Leu Leu 590 595 600 Val Val Leu Met Val Leu Leu Ile Ser Thr Leu Val Phe Val Ile 605 610 615 Leu Gln Gln Pro Glu Asn Pro Lys Lys Leu Pro Tyr Met Ala Pro 620 625 630 Cys Leu Pro Phe Val Pro Ala Phe Ala Met Leu Val Asn Ile Tyr 635 640 645 Leu Met Leu Lys Leu Ser Thr Ile Thr Trp Ile Arg Phe Ala Val 650 655 660 Trp Cys Phe Val Gly Leu Leu Ile Tyr Phe Gly Tyr Gly Ile Trp 665 670 675 Asn Ser Thr Leu Glu Ile Ser Ala Arg Glu Glu Ala Leu His Gln 680 685 690 Ser Thr Tyr Gln Arg Tyr Asp Val Asp Asp Pro Phe Ser Val Glu 695 700 705 Glu Gly Phe Ser Tyr Ala Thr Glu Gly Glu Ser Gln Glu Asp Trp 710 715 720 Gly Gly Pro Thr Glu Asp Lys Gly Phe Tyr Tyr Gln Gln Met Ser 725 730 735 Asp Ala Lys Ala Asn Gly Arg Thr Ser Ser Lys Ala Lys Ser Lys 740 745 750 Ser Lys His Lys Gln Asn Ser Glu Ala Leu Ile Ala Asn Asp Glu 755 760 765 Leu Asp Tyr Ser Pro Glu 770 12 1329 PRT Homo sapiens misc_feature Incyte ID No 3244593CD1 12 Met Val Gly Glu Gly Pro Tyr Leu Ile Ser Asp Leu Asp Gln Arg 1 5 10 15 Gly Arg Arg Arg Ser Phe Ala Glu Arg Tyr Asp Pro Ser Leu Lys 20 25 30 Thr Met Ile Pro Val Arg Pro Cys Ala Arg Leu Ala Pro Asn Pro 35 40 45 Val Asp Asp Ala Gly Leu Leu Ser Phe Ala Thr Phe Ser Trp Leu 50 55 60 Thr Pro Val Met Val Lys Gly Tyr Arg Gln Arg Leu Thr Val Asp 65 70 75 Thr Leu Pro Pro Leu Ser Thr Tyr Asp Ser Ser Asp Thr Asn Ala 80 85 90 Lys Arg Phe Arg Val Leu Trp Asp Glu Glu Val Ala Arg Val Gly 95 100 105 Pro Glu Lys Ala Ser Leu Ser His Val Val Trp Lys Phe Gln Arg 110 115 120 Thr Arg Val Leu Met Asp Ile Val Ala Asn Ile Leu Cys Ile Ile 125 130 135 Met Ala Ala Ile Gly Pro Thr Val Leu Ile His Gln Ile Leu Gln 140 145 150 Gln Thr Glu Arg Thr Ser Gly Lys Val Trp Val Gly Ile Gly Leu 155 160 165 Cys Ile Ala Leu Phe Ala Thr Glu Phe Thr Lys Val Phe Phe Trp 170 175 180 Ala Leu Ala Trp Ala Ile Asn Tyr Arg Thr Ala Ile Arg Leu Lys 185 190 195 Val Ala Leu Ser Thr Leu Val Phe Glu Asn Leu Val Ser Phe Lys 200 205 210 Thr Leu Thr His Ile Ser Val Gly Glu Val Leu Asn Ile Leu Ser 215 220 225 Ser Asp Ser
Tyr Ser Leu Phe Glu Ala Ala Leu Phe Cys Pro Leu 230 235 240 Pro Ala Thr Ile Pro Ile Leu Met Val Phe Cys Ala Ala Tyr Ala 245 250 255 Phe Phe Ile Leu Gly Pro Thr Ala Leu Ile Gly Ile Ser Val Tyr 260 265 270 Val Ile Phe Ile Pro Val Gln Met Phe Met Ala Lys Leu Asn Ser 275 280 285 Ala Phe Arg Arg Ser Ala Ile Leu Val Thr Asp Lys Arg Val Gln 290 295 300 Thr Met Asn Glu Phe Leu Thr Cys Ile Arg Leu Ile Lys Met Tyr 305 310 315 Ala Trp Glu Lys Ser Phe Thr Asn Thr Ile Gln Asp Ile Arg Arg 320 325 330 Arg Glu Arg Lys Leu Leu Glu Lys Ala Gly Phe Val Gln Ser Gly 335 340 345 Asn Ser Ala Leu Ala Pro Ile Val Ser Thr Ile Ala Ile Val Leu 350 355 360 Thr Leu Ser Cys His Ile Leu Leu Arg Arg Lys Leu Thr Ala Pro 365 370 375 Val Ala Phe Ser Val Ile Ala Met Phe Asn Val Met Lys Phe Ser 380 385 390 Ile Ala Ile Leu Pro Phe Ser Ile Lys Ala Met Ala Glu Ala Asn 395 400 405 Val Ser Leu Arg Arg Met Lys Lys Ile Leu Ile Asp Lys Ser Pro 410 415 420 Pro Ser Tyr Ile Thr Gln Pro Glu Asp Pro Asp Thr Val Leu Leu 425 430 435 Leu Ala Asn Ala Thr Leu Thr Trp Glu His Glu Ala Ser Arg Lys 440 445 450 Ser Thr Pro Lys Lys Leu Gln Asn Gln Lys Arg His Leu Cys Lys 455 460 465 Lys Gln Arg Ser Glu Ala Tyr Ser Glu Arg Ser Pro Pro Ala Lys 470 475 480 Gly Ala Thr Gly Pro Glu Glu Gln Ser Asp Ser Leu Lys Ser Val 485 490 495 Leu His Ser Ile Ser Phe Val Val Arg Lys Gly Lys Ile Leu Gly 500 505 510 Ile Cys Gly Asn Val Gly Ser Gly Lys Ser Ser Leu Leu Ala Ala 515 520 525 Leu Leu Gly Gln Met Gln Leu Gln Lys Gly Val Val Ala Val Asn 530 535 540 Gly Thr Leu Ala Tyr Val Ser Gln Gln Ala Trp Ile Phe His Gly 545 550 555 Asn Val Arg Glu Asn Ile Leu Phe Gly Glu Lys Tyr Asp His Gln 560 565 570 Arg Tyr Gln His Thr Val Arg Val Cys Gly Leu Gln Lys Asp Leu 575 580 585 Ser Asn Leu Pro Tyr Gly Asp Leu Thr Glu Ile Gly Glu Arg Gly 590 595 600 Leu Asn Leu Ser Gly Gly Gln Arg Gln Arg Ile Ser Leu Ala Arg 605 610 615 Ala Val Tyr Ser Asp Arg Gln Leu Tyr Leu Leu Asp Asp Pro Leu 620 625 630 Ser Ala Val Asp Ala His Val Gly Lys His Val Phe Glu Glu Cys 635 640 645 Ile Lys Lys Thr Leu Arg
Gly Lys Thr Val Val Leu Val Thr His 650 655 660 Gln Leu Gln Phe Leu Glu Ser Cys Asp Glu Val Ile Leu Leu Glu 665 670 675 Asp Gly Glu Ile Cys Glu Lys Gly Thr His Lys Glu Leu Met Glu 680 685 690 Glu Arg Gly Arg Tyr Ala Lys Leu Ile His Asn Leu Arg Gly Leu 695 700 705 Gln Phe Lys Asp Pro Glu His Leu Tyr Asn Ala Ala Met Val Glu 710 715 720 Ala Phe Lys Glu Ser Pro Ala Glu Arg Glu Glu Asp Ala Gly Ile 725 730 735 Ile Val Leu Ala Pro Gly Asn Glu Lys Asp Glu Gly Lys Glu Ser 740 745 750 Glu Thr Gly Ser Glu Phe Val Asp Thr Lys Gly Tyr Leu Leu Ser 755 760 765 Leu Phe Thr Val Phe Leu Phe Leu Leu Met Ile Gly Ser Ala Ala 770 775 780 Phe Ser Asn Trp Trp Leu Gly Leu Trp Leu Asp Lys Gly Ser Arg 785 790 795 Met Thr Cys Gly Pro Gln Gly Asn Arg Thr Met Cys Glu Val Gly 800 805 810 Ala Val Leu Ala Asp Ile Gly Gln His Val Tyr Gln Trp Val Tyr 815 820 825 Thr Ala Ser Met Val Phe Met Leu Val Phe Gly Val Thr Lys Gly 830 835 840 Phe Val Phe Thr Lys Thr Thr Leu Met Ala Ser Ser Ser Leu His 845 850 855 Asp Thr Val Phe Asp Lys Ile Leu Lys Ser Pro Met Ser Phe Phe 860 865 870 Asp Thr Thr Pro Thr Gly Arg Leu Met Asn Arg Phe Ser Lys Asp 875 880 885 Met Asp Glu Leu Asp Val Arg Leu Pro Phe His Ala Glu Asn Phe 890 895 900 Leu Gln Gln Phe Phe Met Val Val Phe Ile Leu Val Ile Leu Ala 905 910 915 Ala Val Phe Pro Ala Val Leu Leu Val Val Ala Ser Leu Ala Val 920 925 930 Gly Phe Phe Ile Leu Leu Arg Ile Phe His Arg Gly Val Gln Glu 935 940 945 Leu Lys Lys Val Glu Asn Val Ser Arg Ser Pro Trp Phe Thr His 950 955 960 Ile Thr Ser Ser Met Gln Gly Leu Gly Ile Ile His Ala Tyr Gly 965 970 975 Lys Lys Glu Ser Cys Ile Thr Tyr His Leu Leu Tyr Phe Asn Cys 980 985 990 Ala Leu Arg Trp Phe Ala Leu Arg Met Asp Val Leu Met Asn Ile 995 1000 1005 Leu Thr Phe Thr Val Ala Leu Leu Val Thr Leu Ser Phe Ser Ser 1010 1015 1020 Ile Ser Thr Ser Ser Lys Gly Leu Ser Leu Ser Tyr Ile Ile Gln 1025 1030 1035 Leu Ser Gly Leu Leu Gln Val Cys Val Arg Thr Gly Thr Glu Thr 1040 1045 1050 Gln Ala Lys Phe Thr Ser Val Glu Leu Leu Arg Glu Tyr Ile Ser 1055 1060 1065 Thr Cys Val Pro Glu
Cys Thr His Pro Leu Lys Val Gly Thr Cys 1070 1075 1080 Pro Lys Asp Trp Pro Ser Cys Gly Glu Ile Thr Phe Arg Asp Tyr 1085 1090 1095 Gln Met Arg Tyr Arg Asp Asn Thr Pro Leu Val Leu Asp Ser Leu 1100 1105 1110 Asn Leu Asn Ile Gln Ser Gly Gln Thr Val Gly Ile Val Gly Arg 1115 1120 1125 Thr Gly Ser Gly Lys Ser Ser Leu Gly Met Ala Leu Phe Arg Leu 1130 1135 1140 Val Glu Pro Ala Ser Gly Thr Ile Phe Ile Asp Glu Val Asp Ile 1145 1150 1155 Cys Ile Leu Ser Leu Glu Asp Leu Arg Thr Lys Leu Thr Val Ile 1160 1165 1170 Pro Gln Asp Pro Val Leu Phe Val Gly Thr Val Arg Tyr Asn Leu 1175 1180 1185 Asp Pro Phe Glu Ser His Thr Asp Glu Met Leu Trp Gln Val Leu 1190 1195 1200 Glu Arg Thr Phe Met Arg Asp Thr Ile Met Lys Leu Pro Glu Lys 1205 1210 1215 Leu Gln Ala Glu Val Thr Glu Asn Gly Glu Asn Phe Ser Val Gly 1220 1225 1230 Glu Arg Gln Leu Leu Cys Val Ala Arg Ala Leu Leu Arg Asn Ser 1235 1240 1245 Lys Ile Ile Leu Leu Asp Glu Ala Thr Ala Ser Met Asp Ser Lys 1250 1255 1260 Thr Asp Thr Leu Val Gln Asn Thr Ile Lys Asp Ala Phe Lys Gly 1265 1270 1275 Cys Thr Val Leu Thr Ile Ala His Arg Leu Asn Thr Val Leu Asn 1280 1285 1290 Cys Asp His Val Leu Val Met Glu Asn Gly Lys Val Ile Glu Phe 1295 1300 1305 Asp Lys Pro Glu Val Leu Ala Glu Lys Pro Asp Ser Ala Phe Ala 1310 1315 1320 Met Leu Leu Ala Ala Glu Val Arg Leu 1325 13 1353 PRT Homo sapiens misc_feature Incyte ID No 4921451CD1 13 Met Gly Thr Gly Pro Ala Gln Thr Pro Arg Ser Thr Arg Ala Gly 1 5 10 15 Pro Glu Pro Ser Pro Ala Pro Pro Gly Pro Gly Asp Thr Gly Asp 20 25 30 Ser Asp Val Thr Gln Glu Gly Ser Gly Pro Ala Gly Ile Arg Gly 35 40 45 Ala Pro Pro Ala Trp Ala Ala Ser Ala Arg Glu Lys Ile Ser Glu 50 55 60 Met Arg Thr Gly Thr Gln Val Leu Ile Leu Gly Gly Gly Gly Gly 65 70 75 Ala Ala Phe Thr Trp Lys Val Gln Ala Asn Asn Arg Ala Tyr Asn 80 85 90 Gly Gln Phe Lys Glu Lys Val Ile Leu Cys Trp Gln Arg Lys Lys 95 100 105 Tyr Lys Thr Asn Val Ile Arg Thr Ala Lys Tyr Asn Phe Tyr Ser 110 115 120 Phe Leu Pro Leu Asn Leu Tyr Glu Gln Phe His Arg Val Ser Asn 125 130 135 Leu Phe Phe Leu Ile Ile Ile Ile Leu
Gln Ser Ile Pro Asp Ile 140 145 150 Ser Thr Leu Pro Trp Phe Ser Leu Ser Thr Pro Met Val Cys Leu 155 160 165 Leu Phe Ile Arg Ala Thr Arg Asp Leu Val Asp Asp Met Gly Arg 170 175 180 His Lys Ser Asp Arg Ala Ile Asn Asn Arg Pro Cys Gln Ile Leu 185 190 195 Met Gly Lys Ser Phe Lys Gln Lys Lys Trp Gln Asp Leu Cys Val 200 205 210 Gly Asp Val Val Cys Leu Arg Lys Asp Asn Ile Val Pro Val Ser 215 220 225 Trp Gly Gly Pro Arg Gly Pro Arg Thr Thr Arg Pro Leu Thr Glu 230 235 240 Ser Thr Pro Pro Arg Val Gly Arg Ala Ala Ala Pro Pro Ile Cys 245 250 255 Leu Ala Ser Pro Leu Ala Thr Leu Pro Pro Thr Pro His Gln Ala 260 265 270 Asp Met Leu Leu Leu Ala Ser Thr Glu Pro Ser Ser Leu Cys Tyr 275 280 285 Val Glu Thr Val Asp Ile Asp Gly Glu Thr Asn Leu Lys Phe Arg 290 295 300 Gln Ala Leu Met Val Thr His Lys Glu Leu Ala Thr Ile Lys Lys 305 310 315 Met Ala Ser Phe Gln Gly Thr Val Thr Cys Glu Ala Pro Asn Ser 320 325 330 Arg Met His His Phe Val Gly Cys Leu Glu Trp Asn Asp Lys Lys 335 340 345 Tyr Ser Leu Asp Ile Gly Asn Leu Leu Leu Arg Gly Cys Arg Ile 350 355 360 Arg Asn Thr Asp Thr Cys Tyr Gly Leu Val Ile Tyr Ala Gly Phe 365 370 375 Asp Thr Lys Ile Met Lys Asn Cys Gly Lys Ile His Leu Lys Arg 380 385 390 Thr Lys Leu Asp Leu Leu Met Asn Lys Leu Val Val Val Ile Phe 395 400 405 Ile Ser Val Val Leu Val Cys Leu Val Leu Ala Phe Gly Phe Gly 410 415 420 Phe Ser Val Lys Glu Phe Lys Asp His His Tyr Tyr Leu Ser Gly 425 430 435 Val His Gly Ser Ser Val Ala Ala Glu Ser Phe Phe Val Phe Trp 440 445 450 Ser Phe Leu Ile Leu Leu Ser Val Thr Ile Pro Met Ser Met Phe 455 460 465 Ile Leu Ser Glu Phe Ile Tyr Leu Gly Asn Ser Val Phe Ile Asp 470 475 480 Trp Asp Val Gln Met Tyr Tyr Lys Pro Gln Asp Val Pro Ala Lys 485 490 495 Ala Arg Ser Thr Ser Leu Asn Asp His Leu Gly Gln Val Glu Tyr 500 505 510 Ile Phe Ser Asp Lys Thr Gly Thr Leu Thr Gln Asn Ile Leu Thr 515 520 525 Phe Asn Lys Cys Cys Ile Ser Gly Arg Val Tyr Gly Glu Pro Leu 530 535 540 Pro Leu Glu Gln Val Arg Arg Arg Glu Ala Ala Leu Pro Gln Cys 545 550 555 Gly Pro Ala Ala Pro Arg Ala Asp Gln Arg Gly Arg
Gly Arg Ala 560 565 570 Gly Val Leu Ala Pro Ala Gly His Leu Pro His Gly Asp Asp Gln 575 580 585 Leu Leu Tyr Gln Ala Ala Ser Pro Asp Glu Gly Ala Leu Val Thr 590 595 600 Ala Ala Arg Asn Phe Gly Tyr Val Phe Leu Ser Arg Thr Gln Asp 605 610 615 Thr Val Thr Ile Met Glu Leu Gly Glu Glu Arg Val Tyr Gln Val 620 625 630 Leu Ala Ile Met Asp Phe Asn Ser Thr Arg Lys Arg Met Ser Val 635 640 645 Leu Val Arg Lys Pro Glu Gly Ala Ile Cys Leu Tyr Thr Lys Gly 650 655 660 Ala Asp Thr Val Ile Phe Glu Arg Leu His Arg Arg Gly Ala Met 665 670 675 Glu Phe Ala Thr Glu Glu Ala Leu Ala Ala Phe Ala Gln Glu Thr 680 685 690 Leu Arg Thr Leu Cys Leu Ala Tyr Arg Glu Val Ala Glu Asp Ile 695 700 705 Tyr Glu Asp Trp Gln Gln Arg His Gln Glu Ala Ser Leu Leu Leu 710 715 720 Gln Asn Arg Ala Gln Ala Leu Gln Gln Val Tyr Asn Glu Met Glu 725 730 735 Gln Asp Leu Arg Leu Leu Gly Ala Thr Ala Ile Glu Asp Arg Leu 740 745 750 Gln Asp Gly Val Pro Glu Thr Ile Lys Cys Leu Lys Lys Ser Asn 755 760 765 Ile Lys Ile Trp Val Leu Thr Gly Asp Lys Gln Glu Thr Ala Val 770 775 780 Asn Ile Gly Phe Ala Cys Glu Leu Leu Ser Glu Asn Met Leu Ile 785 790 795 Leu Glu Glu Lys Glu Ile Ser Arg Ile Leu Glu Thr Tyr Trp Glu 800 805 810 Asn Ser Asn Asn Leu Leu Thr Arg Glu Ser Leu Ser Gln Val Lys 815 820 825 Leu Ala Leu Val Ile Asn Gly Asp Phe Leu Asp Lys Leu Leu Val 830 835 840 Ser Leu Arg Lys Glu Pro Arg Ala Leu Ala Gln Asn Val Asn Met 845 850 855 Asp Glu Ala Trp Gln Glu Leu Gly Gln Ser Arg Arg Asp Phe Leu 860 865 870 Tyr Ala Arg Arg Leu Ser Leu Leu Cys Arg Arg Phe Gly Leu Pro 875 880 885 Leu Ala Ala Pro Pro Ala Gln Asp Ser Arg Ala Arg Arg Ser Ser 890 895 900 Glu Val Leu Gln Glu Arg Ala Phe Val Asp Leu Ala Ser Lys Cys 905 910 915 Gln Ala Val Ile Cys Cys Arg Val Thr Pro Lys Gln Lys Ala Leu 920 925 930 Ile Val Ala Leu Val Lys Lys Tyr His Gln Val Val Thr Leu Ala 935 940 945 Ile Gly Asp Gly Ala Asn Asp Ile Asn Met Ile Lys Thr Ala Asp 950 955 960 Val Gly Val Gly Leu Ala Gly Gln Glu Gly Met Gln Ala Val Gln 965 970 975 Asn Ser Asp Phe Val Leu Gly Gln Phe Cys Phe Leu Gln Arg Leu
980 985 990 Leu Leu Val His Gly Arg Trp Ser Tyr Val Arg Ile Cys Lys Phe 995 1000 1005 Leu Arg Tyr Phe Phe Tyr Lys Ser Met Ala Ser Met Met Val Gln 1010 1015 1020 Val Trp Phe Ala Cys Tyr Asn Gly Phe Thr Gly Gln Asp Val Ser 1025 1030 1035 Ala Glu Gln Ser Leu Glu Lys Pro Glu Leu Tyr Val Val Gly Gln 1040 1045 1050 Lys Asp Glu Leu Phe Asn Tyr Trp Val Phe Val Gln Ala Ile Ala 1055 1060 1065 His Gly Val Thr Thr Ser Leu Val Asn Phe Phe Met Thr Leu Trp 1070 1075 1080 Ile Ser Arg Asp Thr Ala Gly Pro Ala Ser Phe Ser Asp His Gln 1085 1090 1095 Ser Phe Ala Val Val Val Ala Leu Ser Cys Leu Leu Ser Ile Thr 1100 1105 1110 Met Glu Val Ile Leu Ile Ile Lys Tyr Trp Thr Ala Leu Cys Val 1115 1120 1125 Ala Thr Ile Leu Leu Ser Leu Gly Phe Tyr Ala Ile Met Thr Thr 1130 1135 1140 Thr Thr Gln Ser Phe Trp Leu Phe Arg Val Ser Pro Thr Thr Phe 1145 1150 1155 Pro Phe Leu Tyr Ala Asp Leu Ser Val Met Ser Ser Pro Ser Ile 1160 1165 1170 Leu Leu Val Val Leu Leu Ser Val Ser Ile Asn Thr Phe Pro Val 1175 1180 1185 Leu Ala Leu Arg Val Ile Phe Pro Ala Leu Lys Glu Leu Arg Ala 1190 1195 1200 Lys Glu Glu Lys Val Glu Glu Gly Pro Ser Glu Glu Ile Phe Thr 1205 1210 1215 Met Glu Pro Leu Pro His Val His Arg Glu Ser Arg Ala Arg Arg 1220 1225 1230 Ser Ser Tyr Ala Phe Ser His Arg Gln Leu Thr Leu Glu Ser Gln 1235 1240 1245 Pro Asp Ser Ser Glu Glu Lys Ser Ala Phe Leu Lys Pro Ser Thr 1250 1255 1260 Pro Phe Arg Lys Ser Trp Gln Lys Glu Pro His Thr Pro Lys Glu 1265 1270 1275 Gly Thr Val Pro Leu Pro Asp Lys Thr His Lys Ser Gln Val Glu 1280 1285 1290 Thr Leu Pro Pro Ser Leu Glu Glu Ser Ser Thr Ser Thr Ser Glu 1295 1300 1305 Gln Pro Met Glu Val Glu Leu Trp Pro Ala Glu Lys Gln Ser Ser 1310 1315 1320 Ser Ser Met Glu Trp Leu Leu Val Pro Gly Glu Glu Gln Leu Ser 1325 1330 1335 Leu Pro Pro Glu Glu Gln Ser Leu Pro Ser Ala Glu Gly Thr Arg 1340 1345 1350 Val Gln Gln 14 921 PRT Homo sapiens misc_feature Incyte ID No 5547443CD1 14 Met Ala His Glu Ser Ala Glu Asp Leu Phe His Phe Asn Val Gly 1 5 10 15 Gly Trp His Phe Ser Val Pro Arg Ser Lys Leu Ser Gln Phe Pro 20 25 30
Asp Ser Leu Leu Trp Lys Glu Ala Ser Ala Leu Thr Ser Ser Glu35 40 45 Ser Gln Arg Leu Phe Ile Asp Arg Asp Gly Ser Thr Phe Arg His 5055 60 Val His Tyr Tyr Leu Tyr Thr Ser Lys Leu Ser Phe Ser Ser Cys 65 7075 Ala Glu Leu Asn Leu Leu Tyr Glu Gln Ala Leu Gly Leu Gln Leu 80 85 90Met Pro Leu Leu Gln Thr Leu Asp Asn Leu Lys Glu Gly Lys His 95 100 105His Leu Arg Val Arg Pro Ala Asp Leu Pro Val Ala Glu Arg Ala 110 115 120Ser Leu Asn Tyr Trp Arg Thr Trp Lys Cys Ile Ser Lys Pro Ser 125 130 135Glu Phe Pro Ile Lys Ser Pro Ala Phe Thr Gly Leu His Asp Lys 140 145 150Ala Pro Leu Gly Leu Met Asp Thr Pro Leu Leu Asp Thr Glu Glu 155 160 165Glu Val His Tyr Cys Phe Leu Pro Leu Asp Leu Val Ala Lys Tyr 170 175 180Pro Ser Leu Val Thr Glu Asp Asn Leu Leu Trp Leu Ala Glu Thr 185 190 195Val Ala Leu Ile Glu Cys Glu Cys Ser Glu Phe Arg Phe Ile Val 200 205 210Asn Phe Leu Arg Ser Gln Lys Ile Leu Leu Pro Asp Asn Phe Ser 215 220 225Asn Ile Asp Val Leu Glu Ala Glu Val Glu Ile Leu Glu Ile Pro 230 235 240Ala Leu Thr Glu Ala Val Arg Trp Tyr Arg Met Asn Met Gly Gly 245 250 255Cys Ser Pro Thr Thr Cys Ser Pro Leu Ser Pro Gly Lys Gly Ala 260 265 270Arg Thr Ala Ser Leu Glu Ser Val Lys Pro Leu Tyr Thr Met Ala 275 280 285Leu Gly Leu Leu Val Lys Tyr Pro Asp Ser Ala Leu Gly Gln Leu 290 295 300Arg Ile Glu Ser Thr Leu Asp Gly Ser Arg Leu Tyr Ile Thr Gly 305 310 315Asn Gly Val Leu Phe Gln His Val Lys Asn Trp Leu Gly Thr Cys 320 325 330Arg Leu Pro Leu Thr Glu Thr Ile Ser Glu Val Tyr Glu Leu Cys 335 340 345Ala Phe Leu Asp Lys Arg Asp Ile Thr Tyr Glu Pro Ile Lys Val 350 355 360Ala Leu Lys Thr His Leu Glu Pro Arg Thr Leu Ala Pro Met Asp 365 370 375Val Leu Asn Glu Trp Thr Ala Glu Ile Thr Val Tyr Ser Pro Gln 380 385 390Gln Ile Ile Lys Val Tyr Val Gly Ser His Trp Tyr Ala Thr Thr 395 400 405Leu Gln Thr Leu Leu Lys Tyr Pro Glu Leu Leu Ser Asn Pro Gln 410 415 420Arg Val Tyr Trp Ile Thr Tyr Gly Gln Thr Leu Leu Ile His Gly 425 430 435Asp Gly Gln Met Phe Arg His Ile Leu Asn Phe Leu Arg Leu Gly 440 445 450Lys Leu Phe Leu Pro Ser 
Glu Phe Lys Glu Trp Pro Leu Phe Cys 455 460 465Gln Glu Val Glu Glu Tyr His Ile Pro Ser Leu Ser Glu Ala Leu 470 475 480Ala Gln Cys Glu Ala Tyr Lys Ser Trp Thr Gln Glu Lys Glu Ser 485 490 495Glu Asn Glu Glu Ala Phe Ser Ile Arg Arg Leu His Val Val Thr 500 505 510Glu Gly Pro Gly Ser Leu Val Glu Phe Ser Arg Asp Thr Lys Glu 515 520 525Thr Thr Ala Tyr Met Pro Val Asp Phe Glu Asp Cys Ser Asp Arg 530 535 540Thr Pro Trp Asn Lys Ala Lys Gly Asn Leu Val Arg Ser Asn Gln 545 550 555Met Asp Glu Ala Glu Gln Tyr Thr Arg Pro Ile Gln Val Ser Leu 560 565 570Cys Arg Asn Ala Lys Arg Ala Gly Asn Pro Ser Thr Tyr Ser His 575 580 585Cys Arg Gly Leu Cys Thr Asn Pro Gly His Trp Gly Ser His Pro 590 595 600Glu Ser Pro Pro Lys Lys Lys Cys Thr Thr Ile Asn Leu Thr Gln 605 610 615Lys Ser Glu Thr Lys Asp Pro Pro Ala Thr Pro Met Gln Lys Leu 620 625 630Ile Ser Leu Val Arg Glu Trp Asp Met Val Asn Cys Lys Gln Trp 635 640 645Glu Phe Gln Pro Leu Thr Ala Thr Arg Ser Ser Pro Leu Glu Glu 650 655 660Ala Thr Leu Gln Leu Pro Leu Gly Ser Glu Ala Ala Ser Gln Pro 665 670 675Ser Thr Ser Ala Ala Trp Lys Ala His Ser Thr Ala Ser Glu Lys 680 685 690Asp Pro Gly Pro Gln Ala Gly Ala Gly Ala Gly Ala Lys Asp Lys 695 700 705Gly Pro Glu Pro Thr Phe Lys Pro Tyr Leu Pro Pro Lys Arg Ala 710 715 720Gly Thr Leu Lys Asp Trp Ser Lys Gln Arg Thr Lys Glu Arg Glu 725 730 735Ser Pro Ala Pro Glu Gln Pro Leu Pro Glu Ala Ser Glu Val Asp 740 745 750Ser Leu Gly Val Ile Leu Lys Val Thr His Pro Pro Val Val Gly 755 760 765Ser Asp Gly Phe Cys Met Phe Phe Glu Asp Ser Ile Ile Tyr Thr 770 775 780Thr Glu Met Asp Asn Leu Arg His Thr Thr Pro Thr Ala Ser Pro 785 790 795Gln Pro Gln Glu Val Thr Phe Leu Ser Phe Ser Leu Ser Trp Glu 800 805 810Glu Met Phe Tyr Ala Gln Lys Cys His Cys Phe Leu Ala Asp Ile 815 820 825Ile Met Asp Ser Ile Arg Gln Lys Asp Pro Lys Ala Ile Thr Ala 830 835 840Lys Val Val Ser Leu Ala Asn Arg Leu Trp Thr Leu His Ile Ser 845 850 855Pro Lys Gln Phe Val Val Asp Leu Leu Ala Ile Thr Gly Phe Lys 860 865 870Asp Asp Arg His Thr Gln Glu Arg Leu 
Tyr Ser Trp Val Glu Leu 875 880 885 Thr Leu Pro Phe Ala Arg Lys Tyr Gly Arg Cys Met Asp Leu Leu 890 895 900 Ile Gln Arg Gly Leu Ser Arg Ser Val Ser Tyr Ser Ile Leu Gly 905 910 915 Lys Tyr Leu Gln Glu Asp 920
15 530 PRT Homo sapiens misc_feature Incyte ID No 56008413CD1
15 Met Gly Ser Val Gly Ser Gln Arg Leu Glu Glu Pro Ser Val Ala 1 5 10 15 Gly Thr Pro Asp Pro Gly Val Val Met Ser Phe Thr Phe Asp Ser 20 25 30 His Gln Leu Glu Glu Ala Ala Glu Ala Ala Gln Gly Gln Gly Leu 35 40 45 Arg Ala Arg Gly Val Pro Ala Phe Thr Asp Thr Thr Leu Asp Glu 50 55 60 Pro Val Pro Asp Asp Arg Tyr His Ala Ile Tyr Phe Ala Met Leu 65 70 75 Leu Ala Gly Val Gly Phe Leu Leu Pro Tyr Asn Ser Phe Ile Thr 80 85 90 Asp Val Asp Tyr Leu His His Lys Tyr Pro Gly Thr Ser Ile Val 95 100 105 Phe Asp Met Ser Leu Thr Tyr Ile Leu Val Ala Leu Ala Ala Val 110 115 120 Leu Leu Asn Asn Val Leu Val Glu Arg Leu Thr Leu His Thr Arg 125 130 135 Ile Thr Ala Gly Tyr Leu Leu Ala Leu Gly Pro Leu Leu Phe Ile 140 145 150 Ser Ile Cys Asp Val Trp Leu Gln Leu Phe Ser Arg Asp Gln Ala 155 160 165 Tyr Ala Ile Asn Leu Ala Ala Val Gly Thr Val Ala Phe Gly Cys 170 175 180 Thr Val Gln Gln Ser Ser Phe Tyr Gly Tyr Thr Gly Met Leu Pro 185 190 195 Lys Arg Tyr Thr Gln Gly Val Met Thr Gly Glu Ser Thr Ala Gly 200 205 210 Val Met Ile Ser Leu Ser Arg Ile Leu Thr Lys Leu Leu Leu Pro 215 220 225 Asp Glu Arg Ala Ser Thr Leu Ile Phe Phe Leu Val Ser Val Ala 230 235 240 Leu Glu Leu Leu Cys Phe Leu Leu His Leu Leu Val Arg Arg Ser 245 250 255 Arg Phe Val Leu Phe Tyr Thr Thr Arg Pro Arg Asp Ser His Arg 260 265 270 Gly Arg Pro Gly Leu Gly Arg Gly Tyr Gly Tyr Arg Val His His 275 280 285 Asp Val Val Ala Gly Asp Val His Phe Glu His Pro Ala Pro Ala 290 295 300 Leu Ala Pro Asn Glu Ser Pro Lys Asp Ser Pro Ala His Glu Val 305 310 315 Thr Gly Ser Gly Gly Ala Tyr Met Arg Phe Asp Val Pro Arg Pro 320 325 330 Arg Val Gln Arg Ser Trp Pro Thr Phe Arg Ala Leu Leu Leu His 335 340 345 Arg Tyr Val Val Ala Arg Val Ile Trp Ala Asp Met Leu Ser Ile 350 355 360 Ala Val Thr Tyr Phe Ile Thr Leu Cys Leu Phe
Pro Gly Leu Glu 365 370375 Ser Glu Ile Arg His Cys Ile Leu Gly Glu Trp Leu Pro Ile Leu 380 385390 Ile Met Ala Val Phe Asn Leu Ser Asp Phe Val Gly Lys Ile Leu 395 400405 Ala Ala Leu Pro Val Asp Trp Arg Gly Thr His Leu Leu Ala Cys 410 415420 Ser Cys Leu Arg Val Val Phe Ile Pro Leu Phe Ile Leu Cys Val 425 430435 Tyr Pro Ser Gly Met Pro Ala Leu Arg His Pro Ala Trp Pro Cys 440 445450 Ile Phe Ser Leu Leu Met Gly Ile Ser Asn Gly Tyr Phe Gly Ser 455 460465 Val Pro Met Ile Leu Ala Ala Gly Lys Val Ser Pro Lys Gln Arg 470 475480 Glu Leu Ala Gly Asn Thr Met Thr Val Ser Tyr Met Ser Gly Leu 485 490495 Thr Leu Gly Ser Ala Val Ala Tyr Cys Thr Tyr Ser Leu Thr Arg 500 505510 Asp Ala His Gly Ser Cys Leu His Ala Ser Thr Ala Asn Gly Ser 515 520525 Ile Leu Ala Gly Leu 530 16 1617 PRT Homo sapiens misc_feature IncyteID No 6127911CD1 16 Met Asn Met Lys Gln Lys Ser Val Tyr Gln Gln Thr LysAla Leu 1 5 10 15 Leu Cys Lys Asn Phe Leu Lys Lys Trp Arg Met Lys ArgGlu Ser 20 25 30 Leu Leu Glu Trp Gly Leu Ser Ile Leu Leu Gly Leu Cys IleAla 35 40 45 Leu Phe Ser Ser Ser Met Arg Asn Val Gln Phe Pro Gly Met Ala50 55 60 Pro Gln Asn Leu Gly Arg Val Asp Lys Phe Asn Ser Ser Ser Leu 6570 75 Met Val Val Tyr Thr Pro Ile Ser Asn Leu Thr Gln Gln Ile Met 80 8590 Asn Lys Thr Ala Leu Ala Pro Leu Leu Lys Gly Thr Ser Val Ile 95 100105 Gly Ala Pro Asn Lys Thr His Met Asp Glu Ile Leu Leu Glu Asn 110 115120 Leu Pro Tyr Ala Met Gly Ile Ile Phe Asn Glu Thr Phe Ser Tyr 125 130135 Lys Leu Ile Phe Phe Gln Gly Tyr Asn Ser Pro Leu Trp Lys Glu 140 145150 Asp Phe Ser Ala His Cys Trp Asp Gly Tyr Gly Glu Phe Ser Cys 155 160165 Thr Leu Thr Lys Tyr Trp Asn Arg Gly Phe Val Ala Leu Gln Thr 170 175180 Ala Ile Asn Thr Ala Ile Ile Glu Ile Thr Thr Asn His Pro Val 185 190195 Met Glu Glu Leu Met Ser Val Thr Ala Ile Thr Met Lys Thr Leu 200 205210 Pro Phe Ile Thr Lys Asn Leu Leu His Asn Glu Met Phe Ile Leu 215 220225 Phe Phe Leu Leu His Phe Ser Pro Leu Val Tyr Phe Ile Ser Leu 230 235240 Asn Val Thr Lys Glu Arg Lys Lys Ser Lys Asn Leu Met Lys 
Met 245 250255 Met Gly Leu Gln Asp Ser Ala Phe Trp Leu Ser Trp Gly Leu Ile 260 265270 Tyr Ala Gly Phe Ile Phe Ile Ile Ser Ile Phe Ile Thr Ile Ile 275 280285 Ile Thr Phe Thr Gln Ile Ile Val Met Thr Gly Phe Met Val Ile 290 295300 Phe Ile Leu Phe Phe Leu Tyr Gly Leu Ser Leu Val Ala Leu Val 305 310315 Phe Leu Met Ser Val Leu Leu Lys Lys Ala Val Leu Thr Asn Leu 320 325330 Val Val Phe Leu Leu Thr Leu Phe Trp Gly Cys Leu Gly Phe Thr 335 340345 Val Phe Tyr Glu Gln Leu Pro Ser Ser Leu Glu Trp Ile Leu Asn 350 355360 Ile Cys Ser Pro Phe Ala Phe Thr Thr Gly Met Ile Gln Ile Ile 365 370375 Lys Leu Asp Tyr Asn Leu Asn Gly Val Ile Phe Pro Asp Pro Ser 380 385390 Gly Asp Ser Tyr Thr Met Ile Ala Thr Phe Ser Met Leu Leu Leu 395 400405 Asp Gly Leu Ile Tyr Leu Leu Leu Ala Leu Tyr Phe Asp Lys Ile 410 415420 Leu Pro Tyr Gly Asp Glu Arg His Tyr Ser Pro Leu Phe Phe Leu 425 430435 Asn Ser Ser Ser Cys Phe Gln His Gln Arg Thr Asn Ala Lys Val 440 445450 Ile Glu Lys Glu Ile Asp Ala Glu His Pro Ser Asp Asp Tyr Phe 455 460465 Glu Pro Val Ala Pro Glu Phe Gln Gly Lys Glu Ala Ile Arg Ile 470 475480 Arg Asn Val Lys Lys Glu Tyr Lys Gly Lys Ser Gly Lys Val Glu 485 490495 Ala Leu Lys Gly Leu Leu Phe Asp Ile Tyr Glu Gly Gln Ile Thr 500 505510 Ala Ile Leu Gly His Ser Gly Ala Gly Lys Ser Ser Leu Leu Asn 515 520525 Ile Leu Asn Gly Leu Ser Val Pro Thr Glu Gly Ser Val Thr Ile 530 535540 Tyr Asn Lys Asn Leu Ser Glu Met Gln Asp Leu Glu Glu Ile Arg 545 550555 Lys Ile Thr Gly Val Cys Pro Gln Phe Asn Val Gln Phe Asp Ile 560 565570 Leu Thr Val Lys Glu Asn Leu Ser Leu Phe Ala Lys Ile Lys Gly 575 580585 Ile His Leu Lys Glu Val Glu Gln Glu Val Gln Arg Ile Leu Leu 590 595600 Glu Leu Asp Met Gln Asn Ile Gln Asp Asn Leu Ala Lys His Leu 605 610615 Ser Glu Gly Gln Lys Arg Lys Leu Thr Phe Gly Ile Thr Ile Leu 620 625630 Gly Asp Pro Gln Ile Leu Leu Leu Asp Glu Pro Thr Thr Gly Leu 635 640645 Asp Pro Phe Ser Arg Asp Gln Val Trp Ser Leu Leu Arg Glu Arg 650 655660 Arg Ala Asp His Val Ile Leu Phe Ser Thr Gln Ser Met Asp Glu 665 
670675 Ala Asp Ile Leu Ala Asp Arg Lys Val Ile Met Ser Asn Gly Arg 680 685690 Leu Lys Cys Ala Gly Ser Ser Met Phe Leu Lys Arg Arg Trp Gly 695 700705 Leu Gly Tyr His Leu Ser Leu His Arg Asn Glu Ile Cys Asn Pro 710 715720 Glu Gln Ile Thr Ser Phe Ile Thr His His Ile Pro Asp Ala Lys 725 730735 Leu Lys Thr Glu Asn Lys Glu Lys Leu Val Tyr Thr Leu Pro Leu 740 745750 Glu Arg Thr Asn Thr Phe Pro Asp Leu Phe Ser Asp Leu Asp Lys 755 760765 Cys Ser Asp Gln Gly Val Thr Gly Tyr Asp Ile Ser Met Ser Thr 770 775780 Leu Asn Glu Val Phe Met Lys Leu Glu Gly Gln Ser Thr Ile Glu 785 790795 Gln Asp Phe Glu Gln Val Glu Met Ile Arg Asp Ser Glu Ser Leu 800 805810 Asn Glu Met Glu Leu Ala His Ser Ser Phe Ser Glu Met Gln Thr 815 820825 Ala Val Ser Asp Met Gly Leu Trp Arg Met Gln Val Phe Ala Met 830 835840 Ala Arg Leu Arg Phe Leu Lys Leu Lys Arg Gln Thr Lys Val Leu 845 850855 Leu Thr Leu Leu Leu Val Phe Gly Ile Ala Ile Phe Pro Leu Ile 860 865870 Val Glu Asn Ile Ile Tyr Ala Met Leu Asn Glu Lys Ile Asp Trp 875 880885 Glu Phe Lys Asn Glu Leu Tyr Phe Leu Ser Pro Gly Gln Leu Pro 890 895900 Gln Glu Pro Arg Thr Ser Leu Leu Ile Ile Asn Asn Thr Glu Ser 905 910915 Asn Ile Glu Asp Phe Ile Lys Ser Leu Lys His Gln Asn Ile Leu 920 925930 Leu Glu Val Asp Asp Phe Glu Asn Arg Asn Gly Thr Asp Gly Leu 935 940945 Ser Tyr Asn Gly Ala Ile Ile Val Ser Gly Lys Gln Lys Asp Tyr 950 955960 Arg Phe Ser Val Val Cys Asn Thr Lys Arg Leu His Cys Phe Pro 965 970975 Ile Leu Met Asn Ile Ile Ser Asn Gly Leu Leu Gln Met Phe Asn 980 985990 His Thr Gln His Ile Arg Ile Glu Ser Ser Pro Phe Pro Leu Ser 995 10001005 His Ile Gly Leu Trp Thr Gly Leu Pro Asp Gly Ser Phe Phe Leu 10101015 1020 Phe Leu Val Leu Cys Ser Ile Ser Pro Tyr Ile Thr Met Gly Ser1025 1030 1035 Ile Ser Asp Tyr Lys Lys Asn Ala Lys Ser Gln Leu Trp IleSer 1040 1045 1050 Gly Leu Tyr Thr Ser Ala Tyr Trp Cys Gly Gln Ala LeuVal Asp 1055 1060 1065 Val Ser Phe Phe Ile Leu Ile Leu Leu Leu Met TyrLeu Ile Phe 1070 1075 1080 Tyr Ile Glu Asn Met Gln Tyr Leu Leu Ile ThrSer Gln Ile Val 
1085 1090 1095 Phe Ala Leu Val Ile Val Thr Pro Gly TyrAla Ala Ser Leu Val 1100 1105 1110 Phe Phe Ile Tyr Met Ile Ser Phe IlePhe Arg Lys Arg Arg Lys 1115 1120 1125 Asn Ser Gly Leu Trp Ser Phe TyrPhe Phe Phe Ala Ser Thr Ile 1130 1135 1140 Met Phe Ser Ile Thr Leu IleAsn His Phe Asp Leu Ser Ile Leu 1145 1150 1155 Ile Thr Thr Met Val LeuVal Pro Ser Tyr Thr Leu Leu Gly Phe 1160 1165 1170 Lys Thr Phe Leu GluVal Arg Asp Gln Glu His Tyr Arg Glu Phe 1175 1180 1185 Pro Glu Ala AsnPhe Glu Leu Ser Ala Thr Asp Phe Leu Val Cys 1190 1195 1200 Phe Ile ProTyr Phe Gln Thr Leu Leu Phe Val Phe Val Leu Arg 1205 1210 1215 Cys MetGlu Leu Lys Cys Gly Lys Lys Arg Met Arg Lys Asp Pro 1220 1225 1230 ValPhe Arg Ile Ser Pro Gln Ser Arg Asp Ala Lys Pro Asn Pro 1235 1240 1245Glu Glu Pro Ile Asp Glu Asp Glu Asp Ile Gln Thr Glu Arg Ile 1250 12551260 Arg Thr Ala Thr Ala Leu Thr Thr Ser Ile Leu Asp Glu Lys Pro 12651270 1275 Val Ile Ile Ala Ser Cys Leu His Lys Glu Tyr Ala Gly Gln Lys1280 1285 1290 Lys Ser Cys Phe Ser Lys Arg Lys Lys Lys Ile Ala Ala ArgAsn 1295 1300 1305 Ile Ser Phe Cys Val Gln Glu Gly Glu Ile Leu Gly LeuLeu Gly 1310 1315 1320 Pro Ser Gly Ala Gly Lys Ser Ser Ser Ile Arg MetIle Ser Gly 1325 1330 1335 Ile Thr Lys Pro Thr Ala Gly Glu Val Glu LeuLys Gly Cys Ser 1340 1345 1350 Ser Val Leu Gly His Leu Gly Tyr Cys ProGln Glu Asn Val Leu 1355 1360 1365 Trp Pro Met Leu Thr Leu Arg Glu HisLeu Glu Val Tyr Ala Ala 1370 1375 1380 Val Lys Gly Leu Arg Lys Ala AspAla Arg Leu Ala Ile Ala Arg 1385 1390 1395 Leu Val Ser Ala Phe Lys LeuHis Glu Gln Leu Asn Val Pro Val 1400 1405 1410 Gln Lys Leu Thr Ala GlyIle Thr Arg Lys Leu Cys Phe Val Leu 1415 1420 1425 Ser Leu Leu Gly AsnSer Pro Val Leu Leu Leu Asp Glu Pro Ser 1430 1435 1440 Thr Gly Ile AspPro Thr Gly Gln Gln Gln Met Trp Gln Ala Ile 1445 1450 1455 Gln Ala ValVal Lys Asn Thr Glu Arg Gly Val Leu Leu Thr Thr 1460 1465 1470 His AsnLeu Ala Glu Ala Glu Ala Leu Cys Asp Arg Val Ala Ile 1475 1480 1485 MetVal Ser Gly Arg Leu Arg Cys Ile Gly Ser Ile Gln His Leu 
1490 1495 1500Lys Asn Lys Leu Gly Lys Asp Tyr Ile Leu Glu Leu Lys Val Lys 1505 15101515 Glu Thr Ser Gln Val Thr Leu Val His Thr Glu Ile Leu Lys Leu 15201525 1530 Phe Pro Gln Ala Ala Gly Gln Glu Arg Tyr Ser Ser Leu Leu Thr1535 1540 1545 Tyr Lys Leu Pro Val Ala Asp Val Tyr Pro Leu Ser Gln ThrPhe 1550 1555 1560 His Lys Leu Glu Ala Val Lys His Asn Phe Asn Leu GluGlu Tyr 1565 1570 1575 Ser Leu Ser Gln Cys Thr Leu Glu Lys Val Phe LeuGlu Leu Ser 1580 1585 1590 Lys Glu Gln Glu Val Gly Asn Phe Asp Glu GluIle Asp Thr Thr 1595 1600 1605 Met Arg Trp Lys Leu Leu Pro His Ser AspGlu Pro 1610 1615 17 1192 PRT Homo sapiens misc_feature Incyte ID No6427133CD1 17 Met Phe Cys Ser Glu Lys Lys Leu Arg Glu Val Glu Arg IleVal 1 5 10 15 Lys Ala Asn Asp Arg Glu Tyr Asn Glu Lys Phe Gln Tyr AlaAsp 20 25 30 Asn Arg Ile His Thr Ser Lys Tyr Asn Ile Leu Thr Phe Leu Pro35 40 45 Ile Asn Leu Phe Glu Gln Phe Gln Arg Val Ala Asn Ala Tyr Phe 5055 60 Leu Cys Leu Leu Ile Leu Gln Leu Ile Pro Glu Ile Ser Ser Leu 65 7075 Thr Trp Phe Thr Thr Ile Val Pro Leu Val Leu Val Ile Thr Met 80 85 90Thr Ala Val Lys Asp Ala Thr Asp Asp Tyr Phe Arg His Lys Ser 95 100 105Asp Asn Gln Val Asn Asn Arg Gln Ser Glu Val Leu Ile Asn Ser 110 115 120Lys Leu Gln Asn Glu Lys Trp Met Asn Val Lys Val Gly Asp Ile 125 130 135Ile Lys Leu Glu Asn Asn Gln Phe Val Ala Ala Asp Leu Leu Leu 140 145 150Leu Ser Ser Ser Glu Pro His Gly Leu Cys Tyr Val Glu Thr Ala 155 160 165Glu Leu Asp Gly Glu Thr Asn Leu Lys Val Arg His Ala Leu Ser 170 175 180Val Thr Ser Glu Leu Gly Ala Asp Ile Ser Arg Leu Ala Gly Phe 185 190 195Asp Gly Ile Val Val Cys Glu Val Pro Asn Asn Lys Leu Asp Lys 200 205 210Phe Met Gly Ile Leu Ser Trp Lys Asp Ser Lys His Ser Leu Asn 215 220 225Asn Glu Lys Ile Ile Pro Arg Gly Cys Ile Leu Arg Asn Thr Ser 230 235 240Trp Cys Phe Gly Met Val Ile Phe Ala Gly Pro Asp Thr Lys Leu 245 250 255Met Gln Asn Ser Gly Lys Thr Lys Phe Lys Arg Thr Ser Ile Asp 260 265 270Arg Leu Met Asn Thr Leu Val Leu Trp Ile Phe Gly Phe Leu Ile 275 280 285Cys Leu Gly Ile 
Ile Leu Ala Ile Gly Asn Ser Ile Trp Glu Ser 290 295 300Gln Thr Gly Asp Gln Phe Arg Thr Phe Leu Phe Trp Asn Glu Gly 305 310 315Glu Lys Ser Ser Val Phe Ser Gly Phe Leu Thr Phe Trp Ser Tyr 320 325 330Ile Ile Ile Leu Asn Thr Val Val Pro Ile Ser Leu Tyr Val Ser 335 340 345Val Glu Val Ile Arg Leu Gly His Ser Tyr Phe Ile Asn Trp Asp 350 355 360Arg Lys Met Tyr Tyr Ser Arg Lys Ala Ile Pro Ala Val Ala Arg 365 370 375Thr Thr Thr Leu Asn Glu Glu Leu Gly Gln Ile Glu Tyr Ile Phe 380 385 390Ser Asp Lys Thr Gly Thr Leu Thr Gln Asn Ile Met Thr Phe Lys 395 400 405Arg Cys Ser Ile Asn Gly Arg Ile Tyr Gly Glu Val His Asp Asp 410 415 420Leu Asp Gln Lys Thr Glu Ile Thr Gln Glu Lys Glu Pro Val Asp 425 430 435Phe Ser Val Lys Ser Gln Ala Asp Arg Glu Phe Gln Phe Phe Asp 440 445 450His Asn Leu Met Glu Ser Ile Lys Met Gly Asp Pro Lys Val His 455 460 465Glu Phe Leu Arg Leu Leu Ala Leu Cys His Thr Val Met Ser Glu 470 475 480Glu Asn Ser Ala Gly Glu Leu Ile Tyr Gln Val Gln Ser Pro Asp 485 490 495Glu Gly Ala Leu Val Thr Ala Ala Arg Asn Phe Gly Phe Ile Phe 500 505 510Lys Ser Arg Thr Pro Glu Thr Ile Thr Ile Glu Glu Leu Gly Thr 515 520 525Leu Val Thr Tyr Gln Leu Leu Ala Phe Leu Asp Phe Asn Asn Thr 530 535 540Arg Lys Arg Met Ser Val Ile Val Arg Asn Pro Glu Gly Gln Ile 545 550 555Lys Leu Tyr Ser Lys Gly Ala Asp Thr Ile Leu Phe Glu Lys Leu 560 565 570His Pro Ser Asn Glu Val Leu Leu Ser Leu Thr Ser Asp His Leu 575 580 585Ser Glu Phe Ala Gly Glu Gly Leu Arg Thr Leu Ala Ile Ala Tyr 590 595 600Arg Asp Leu Asp Asp Lys Tyr Phe Lys Glu Trp His Lys Met Leu 605 610 615Glu Asp Ala Asn Ala Ala Thr Glu Glu Arg Asp Glu Arg Ile Ala 620 625 630Gly Leu Tyr Glu Glu Ile Glu Arg Asp Leu Met Leu Leu Gly Ala 635 640 645Thr Ala Val Glu Asp Lys Leu Gln Glu Gly Val Ile Glu Thr Val 650 655 660Thr Ser Leu Ser Leu Ala Asn Ile Lys Ile Trp Val Leu Thr Gly 665 670 675Asp Lys Gln Glu Thr Ala Ile Asn Ile Gly Tyr Ala Cys Asn Met 680 685 690Leu Thr Asp Asp Met Asn Asp Val Phe Val Ile Ala Gly Asn Asn 695 700 705Ala Val Glu Val Arg Glu Glu 
Leu Arg Lys Ala Lys Gln Asn Leu 710 715 720Phe Gly Gln Asn Arg Asn Phe Ser Asn Gly His Val Val Cys Glu 725 730 735Lys Lys Gln Gln Leu Glu Leu Asp Ser Ile Val Glu Glu Thr Ile 740 745 750Thr Gly Asp Tyr Ala Leu Ile Ile Asn Gly His Ser Leu Ala His 755 760 765Ala Leu Glu Ser Asp Val Lys Asn Asp Leu Leu Glu Leu Ala Cys 770 775 780Met Cys Lys Thr Val Ile Cys Cys Arg Val Thr Pro Leu Gln Lys 785 790 795Ala Gln Val Val Glu Leu Val Lys Lys Tyr Arg Asn Ala Val Thr 800 805 810Leu Ala Ile Gly Asp Gly Ala Asn Asp Val Ser Met Ile Lys Ser 815 820 825Ala His Ile Gly Val Gly Ile Ser Gly Gln Glu Gly Leu Gln Ala 830 835 840Val Leu Ala Ser Asp Tyr Ser Phe Ala Gln Phe Arg Tyr Leu Gln 845 850 855Arg Leu Leu Leu Val His Gly Arg Trp Ser Tyr Phe Arg Met Cys 860 865 870Lys Phe Leu Cys Tyr Phe Phe Tyr Lys Asn Phe Ala Phe Thr Leu 875 880 885Val His Phe Trp Phe Gly Phe Phe Cys Gly Phe Ser Ala Gln Thr 890 895 900Val Tyr Asp Gln Trp Phe Ile Thr Leu Phe Asn Ile Val Tyr Thr 905 910 915Ser Leu Pro Val Leu Ala Met Gly Ile Phe Asp Gln Asp Val Ser 920 925 930Asp Gln Asn Ser Val Asp Cys Pro Gln Leu Tyr Lys Pro Gly Gln 935 940 945Leu Asn Leu Leu Phe Asn Lys Arg Lys Phe Phe Ile Cys Val Leu 950 955 960His Gly Ile Tyr Thr Ser Leu Val Leu Phe Phe Ile Pro Tyr Gly 965 970 975Ala Phe Tyr Asn Val Ala Gly Glu Asp Gly Gln His Ile Ala Asp 980 985 990Tyr Gln Ser Phe Ala Val Thr Met Ala Thr Ser Leu Val Ile Val 995 10001005 Val Ser Val Gln Ile Ala Leu Asp Thr Ser Tyr Trp Thr Phe Ile 10101015 1020 Asn His Val Phe Ile Trp Gly Ser Ile Ala Ile Tyr Phe Ser Ile1025 1030 1035 Leu Phe Thr Met His Ser Asn Gly Ile Phe Gly Ile Phe ProAsn 1040 1045 1050 Gln Phe Pro Phe Val Gly Asn Ala Arg His Ser Leu ThrGln Lys 1055 1060 1065 Cys Ile Trp Leu Val Ile Leu Leu Thr Thr Val AlaSer Val Met 1070 1075 1080 Pro Val Val Ala Phe Arg Phe Leu Lys Val AspLeu Tyr Pro Thr 1085 1090 1095 Leu Ser Asp Gln Ile Arg Arg Trp Gln LysAla Gln Lys Lys Ala 1100 1105 1110 Arg Pro Pro Ser Ser Arg Arg Pro ArgThr Arg Arg Ser Ser Ser 1115 1120 1125 Arg Arg Ser 
Gly Tyr Ala Phe Ala His Gln Glu Gly Tyr Gly Glu 1130 1135 1140 Leu Ile Thr Ser Gly Lys Asn Met Arg Ala Lys Asn Pro Pro Pro 1145 1150 1155 Thr Ser Gly Leu Glu Lys Thr His Tyr Asn Ser Thr Ser Trp Ile 1160 1165 1170 Glu Asn Leu Cys Lys Lys Thr Thr Asp Thr Val Ser Ser Phe Ser 1175 1180 1185 Gln Asp Lys Thr Val Lys Leu 1190
18 625 PRT Homo sapiens misc_feature Incyte ID No 7472932CD1
18 Met Ala His Ala Pro Glu Pro Asp Pro Ala Ala Ser Asp Leu Gly 1 5 10 15 Asp Glu Arg Pro Lys Trp Asp Asn Lys Ala Gln Tyr Leu Leu Ser 20 25 30 Cys Ile Gly Phe Ala Val Gly Leu Gly Asn Ile Trp Arg Phe Pro 35 40 45 Tyr Leu Cys Gln Thr Tyr Gly Gly Gly Ala Phe Leu Ile Pro Tyr 50 55 60 Val Ile Ala Leu Val Phe Glu Gly Ile Pro Ile Phe His Val Glu 65 70 75 Leu Ala Ile Gly Gln Arg Leu Arg Lys Gly Ser Val Gly Val Trp 80 85 90 Thr Ala Ile Ser Pro Tyr Leu Ser Gly Val Gly Leu Gly Cys Val 95 100 105 Thr Leu Ser Phe Leu Ile Ser Leu Tyr Tyr Asn Thr Ile Val Ala 110 115 120 Trp Val Leu Trp Tyr Leu Leu Asn Ser Phe Gln His Pro Leu Pro 125 130 135 Trp Ser Ser Cys Pro Pro Asp Leu Asn Arg Thr Gly Phe Val Glu 140 145 150 Glu Cys Gln Gly Ser Ser Ala Val Ser Tyr Phe Trp Tyr Arg Gln 155 160 165 Thr Leu Asn Ile Thr Ala Asp Ile Asn Asp Ser Gly Ser Ile Gln 170 175 180 Trp Trp Leu Leu Ile Cys Leu Ala Ala Ser Trp Ala Val Val Tyr 185 190 195 Met Cys Val Ile Arg Gly Ile Glu Thr Thr Gly Lys Val Ile Tyr 200 205 210 Phe Thr Ala Leu Phe Pro Tyr Leu Val Leu Thr Ile Phe Leu Ile 215 220 225 Arg Gly Leu Thr Leu Pro Gly Ala Thr Lys Gly Leu Ile Tyr Leu 230 235 240 Phe Thr Pro Asn Met His Ile Leu Gln Asn Pro Arg Val Trp Leu 245 250 255 Asp Ala Ala Thr Gln Ile Phe Phe Ser Leu Ser Leu Ala Phe Gly 260 265 270 Gly His Ile Ala Phe Ala Ser Tyr Asn Ser Pro Arg Asn Asp Cys 275 280 285 Gln Lys Asp Ala Val Val Ile Ala Leu Val Asn Arg Met Thr Ser 290 295 300 Leu Tyr Ala Ser Ile Ala Val Phe Ser Val Leu Gly Phe Lys Ala 305 310 315 Thr Asn Asp Cys Pro Arg Arg Asn Ile Leu Ser Leu Ile Asn Asp 320 325 330 Phe Asp Phe Pro Glu Gln Ser Ile Ser Arg Asp Asp Tyr Pro Ala 335 340 345 Val
Leu Met His Leu Asn Ala Thr Trp Pro Lys Arg Val Ala Gln 350 355 360Leu Pro Leu Lys Ala Cys Leu Leu Glu Asp Phe Leu Asp Lys Ser 365 370 375Ala Ser Gly Pro Gly Leu Ala Phe Val Val Phe Thr Glu Thr Asp 380 385 390Leu His Met Pro Gly Ala Pro Val Trp Ala Met Leu Phe Phe Gly 395 400 405Met Leu Phe Thr Leu Gly Leu Ser Thr Met Phe Gly Thr Val Glu 410 415 420Ala Val Ile Thr Pro Leu Leu Asp Val Gly Val Leu Pro Arg Trp 425 430 435Val Pro Lys Glu Ala Leu Thr Gly Leu Val Cys Leu Val Cys Phe 440 445 450Leu Ser Ala Thr Cys Phe Thr Leu Gln Ser Gly Asn Tyr Trp Leu 455 460 465Glu Ile Phe Asp Asn Phe Ala Ala Ser Leu Asn Leu Leu Met Leu 470 475 480Ala Phe Leu Glu Val Val Gly Val Val Tyr Val Tyr Gly Met Lys 485 490 495Arg Phe Cys Asp Asp Ile Ala Trp Met Thr Gly Arg Arg Pro Ser 500 505 510Pro Tyr Trp Arg Leu Thr Trp Arg Val Val Ser Pro Leu Leu Leu 515 520 525Thr Ile Phe Val Ala Tyr Ile Ile Leu Leu Phe Trp Lys Pro Leu 530 535 540Arg Tyr Lys Ala Trp Asn Pro Lys Tyr Glu Leu Phe Pro Ser Arg 545 550 555Gln Glu Lys Leu Tyr Pro Gly Trp Ala Arg Ala Ala Cys Val Leu 560 565 570Leu Ser Leu Leu Pro Val Leu Trp Val Pro Val Ala Ala Leu Ala 575 580 585Gln Leu Leu Thr Arg Arg Arg Arg Thr Trp Arg Asp Arg Asp Ala 590 595 600Arg Pro Asp Thr Asp Met Arg Pro Asp Thr Asp Thr Arg Pro Asp 605 610 615Thr Asp Met Arg Pro Asp Thr Asp Met Arg 620 625 19 1181 PRT Homo sapiensmisc_feature Incyte ID No 8463147CD1 19 Met Thr Gln Ala Tyr Gln Lys TyrIle Leu Glu Lys Leu Pro Lys 1 5 10 15 Ser Pro Gly Asp Lys Gly Arg AlaTrp Pro Gly Ser Thr Pro Ser 20 25 30 Gly Asn Leu Leu Ser Pro Phe Met AlaAla Ser Asn Ser Phe Pro 35 40 45 Glu Leu Cys Ser Gln Val Ser Arg Arg GluTyr Trp Asp Leu His 50 55 60 Gly Ile Pro Ser Asp His Phe Ser Val Arg ValGln Val Glu Phe 65 70 75 Tyr Met Asn Glu Asn Thr Phe Lys Glu Arg Leu ThrLeu Phe Phe 80 85 90 Ile Thr Asn Gln Arg Ser Ser Leu Arg Ile Arg Leu PheAsn Phe 95 100 105 Ser Leu Lys Leu Leu Ser Cys Leu Leu Tyr Ile Ile ArgVal Leu 110 115 120 Leu Glu Asn Pro Ser Gln Gly Asn Glu Trp Ser His IlePhe Trp 125 
130 135 Val Asn Arg Ser Leu Pro Leu Trp Gly Leu Gln Val SerVal Ala 140 145 150 Leu Ile Ser Leu Phe Glu Thr Ile Leu Leu Gly Tyr LeuSer Tyr 155 160 165 Lys Gly Asn Ile Trp Glu Gln Ile Leu Arg Ile Pro PheIle Leu 170 175 180 Glu Ile Ile Asn Ala Val Pro Phe Ile Ile Ser Ile PheTrp Pro 185 190 195 Ser Leu Arg Asn Leu Phe Val Pro Val Phe Leu Asn CysTrp Leu 200 205 210 Ala Lys His Ala Leu Glu Asn Met Ile Asn Asp Leu HisArg Ala 215 220 225 Ile Gln Arg Thr Gln Cys Cys Lys Cys Val Asn Gln ValLeu Ile 230 235 240 Val Ile Ser Thr Leu Leu Cys Leu Ile Phe Thr Cys IleCys Gly 245 250 255 Ile Gln His Leu Glu Arg Ile Gly Lys Lys Leu Asn LeuPhe Asp 260 265 270 Ser Leu Tyr Phe Cys Ile Val Thr Phe Ser Thr Val GlyPhe Gly 275 280 285 Asp Val Thr Pro Glu Thr Trp Ser Ser Lys Leu Phe ValVal Ala 290 295 300 Met Ile Cys Val Ala Leu Val Val Leu Pro Ile Gln PheGlu Gln 305 310 315 Leu Ala Tyr Leu Trp Met Glu Arg Gln Lys Ser Gly GlyAsn Tyr 320 325 330 Ser Arg His Arg Ala Gln Thr Glu Lys His Val Val LeuCys Val 335 340 345 Ser Ser Leu Lys Ile Asp Leu Leu Met Asp Phe Leu AsnGlu Phe 350 355 360 Tyr Ala His Pro Arg Leu Gln Asp Tyr Tyr Val Val IleLeu Cys 365 370 375 Pro Thr Glu Met Asp Val Gln Val Arg Arg Val Leu GlnIle Pro 380 385 390 Met Trp Ser Gln Arg Val Ile Tyr Leu Gln Gly Ser AlaLeu Lys 395 400 405 Asp Gln Asp Leu Leu Arg Ala Lys Met Asp Asp Ala GluAla Cys 410 415 420 Phe Ile Leu Ser Ser Arg Cys Glu Val Asp Arg Thr SerSer Asp 425 430 435 His Gln Thr Ile Leu Arg Ala Trp Ala Val Lys Asp PheAla Pro 440 445 450 Asn Cys Pro Leu Tyr Val Gln Ile Leu Lys Pro Glu AsnLys Phe 455 460 465 His Ile Lys Phe Ala Asp His Val Val Cys Glu Glu GluPhe Lys 470 475 480 Tyr Ala Met Leu Ala Leu Asn Cys Ile Cys Pro Ala ThrSer Thr 485 490 495 Leu Ile Thr Leu Leu Val His Thr Ser Arg Gly Gln CysVal Cys 500 505 510 Leu Cys Cys Arg Glu Gly Gln Gln Ser Pro Glu Gln TrpGln Lys 515 520 525 Met Tyr Gly Arg Cys Ser Gly Asn Glu Val Tyr His IleVal Leu 530 535 540 Glu Glu Ser Thr Phe Phe Ala Glu Tyr Glu Gly Lys SerPhe Thr 545 550 555 Tyr 
Ala Ser Phe His Ala His Lys Lys Phe Gly Val CysLeu Ile 560 565 570 Gly Val Arg Arg Glu Asp Asn Lys Asn Ile Leu Leu AsnPro Gly 575 580 585 Pro Arg Tyr Ile Met Asn Ser Thr Asp Ile Cys Phe TyrIle Asn 590 595 600 Ile Thr Lys Glu Glu Asn Ser Ala Phe Lys Asn Gln AspGln Gln 605 610 615 Arg Lys Ser Asn Val Ser Arg Ser Phe Tyr His Gly ProSer Arg 620 625 630 Leu Pro Val His Ser Ile Ile Ala Ser Met Gly Thr ValAla Ile 635 640 645 Asp Leu Gln Asp Thr Ser Cys Arg Ser Ala Ser Gly ProThr Leu 650 655 660 Ser Leu Pro Thr Glu Gly Ser Lys Glu Ile Arg Arg ProSer Ile 665 670 675 Ala Pro Val Leu Glu Val Ala Asp Thr Ser Ser Ile GlnThr Cys 680 685 690 Asp Leu Leu Ser Asp Gln Ser Glu Asp Glu Thr Thr ProAsp Glu 695 700 705 Glu Met Ser Ser Asn Leu Glu Tyr Ala Lys Gly Tyr ProPro Tyr 710 715 720 Ser Pro Tyr Ile Gly Ser Ser Pro Thr Phe Cys His LeuLeu His 725 730 735 Glu Lys Val Pro Phe Cys Cys Leu Arg Leu Asp Lys SerCys Gln 740 745 750 His Asn Tyr Tyr Glu Asp Ala Lys Ala Tyr Gly Phe LysAsn Lys 755 760 765 Leu Ile Ile Val Ala Ala Glu Thr Ala Gly Asn Gly LeuTyr Asn 770 775 780 Phe Ile Val Pro Leu Arg Ala Tyr Tyr Arg Pro Lys LysGlu Leu 785 790 795 Asn Pro Ile Val Leu Leu Leu Asp Asn Pro Pro Asp MetHis Phe 800 805 810 Leu Asp Ala Ile Cys Trp Phe Pro Met Val Tyr Tyr MetVal Gly 815 820 825 Ser Ile Asp Asn Leu Asp Asp Leu Leu Arg Cys Gly ValThr Phe 830 835 840 Ala Ala Asn Met Val Val Val Asp Lys Glu Ser Thr MetSer Ala 845 850 855 Glu Glu Asp Tyr Met Ala Asp Ala Lys Thr Ile Val AsnVal Gln 860 865 870 Thr Leu Phe Arg Leu Phe Ser Ser Leu Ser Ile Ile ThrGlu Leu 875 880 885 Thr His Pro Ala Asn Met Arg Phe Met Gln Phe Arg AlaLys Asp 890 895 900 Cys Tyr Ser Leu Ala Leu Ser Lys Leu Glu Lys Lys GluArg Glu 905 910 915 Arg Gly Ser Asn Leu Ala Phe Met Phe Arg Leu Pro PheAla Ala 920 925 930 Gly Arg Val Phe Ser Ile Ser Met Leu Asp Thr Leu LeuTyr Gln 935 940 945 Ser Phe Val Lys Asp Tyr Met Ile Ser Ile Thr Arg LeuLeu Leu 950 955 960 Gly Leu Asp Thr Thr Pro Gly Ser Gly Phe Leu Cys SerMet Lys 965 970 975 Ile Thr Ala Asp 
Asp Leu Trp Ile Arg Thr Tyr Ala ArgLeu Tyr 980 985 990 Gln Lys Leu Cys Ser Ser Thr Gly Asp Val Pro Ile GlyIle Tyr 995 1000 1005 Arg Thr Glu Ser Gln Lys Leu Thr Thr Ser Glu SerGln Ile Ser 1010 1015 1020 Ile Ser Val Glu Glu Trp Glu Asp Thr Lys AspSer Lys Glu Gln 1025 1030 1035 Gly His His Arg Ser Asn His Arg Asn SerThr Ser Ser Asp Gln 1040 1045 1050 Ser Asp His Pro Leu Leu Arg Arg LysSer Met Gln Trp Ala Arg 1055 1060 1065 Arg Leu Ser Arg Lys Gly Pro LysHis Ser Gly Lys Thr Ala Glu 1070 1075 1080 Lys Ile Thr Gln Gln Arg LeuAsn Leu Tyr Arg Arg Ser Glu Arg 1085 1090 1095 Gln Glu Leu Ala Glu LeuVal Lys Asn Arg Met Lys His Leu Gly 1100 1105 1110 Leu Ser Thr Val GlyTyr Asp Glu Met Asn Asp His Gln Ser Thr 1115 1120 1125 Leu Ser Tyr IleLeu Ile Asn Pro Ser Pro Asp Thr Arg Ile Glu 1130 1135 1140 Leu Asn AspVal Val Tyr Leu Ile Arg Pro Asp Pro Leu Ala Tyr 1145 1150 1155 Leu ProAsn Ser Glu Pro Ser Arg Arg Asn Ser Ile Cys Asn Val 1160 1165 1170 ThrGly Gln Asp Ser Arg Glu Glu Thr Gln Leu 1175 1180 20 233 PRT Homosapiens misc_feature Incyte ID No 7506408CD1 20 Met Leu Glu Gly Ala GluLeu Tyr Phe Asn Val Asp His Gly Tyr 1 5 10 15 Leu Glu Gly Leu Val ArgGly Cys Lys Ala Ser Leu Leu Thr Gln 20 25 30 Gln Asp Tyr Ile Asn Leu ValGln Cys Glu Thr Leu Glu Ala Pro 35 40 45 Phe Phe Gln Asp Cys Met Ser GluAsn Ala Leu Asp Glu Leu Asn 50 55 60 Ile Glu Leu Leu Arg Asn Lys Leu TyrLys Ser Tyr Leu Glu Ala 65 70 75 Phe Tyr Lys Phe Cys Lys Asn His Gly AspVal Thr Ala Glu Val 80 85 90 Met Cys Pro Ile Leu Glu Phe Glu Ala Asp ArgArg Ala Phe Ile 95 100 105 Ile Thr Leu Asn Ser Phe Gly Thr Glu Leu SerLys Glu Asp Arg 110 115 120 Glu Thr Leu Tyr Pro Thr Phe Gly Lys Leu TyrPro Glu Gly Leu 125 130 135 Arg Leu Leu Ala Gln Ala Glu Asp Phe Asp GlnMet Lys Asn Val 140 145 150 Ala Asp His Tyr Gly Val Tyr Lys Pro Leu PheGlu Ala Val Gly 155 160 165 Gly Ser Gly Gly Lys Thr Leu Glu Asp Val PheTyr Glu Arg Glu 170 175 180 Val Gln Met Asn Val Leu Ala Phe Asn Arg GlnPhe His Tyr Gly 185 190 195 Val Phe Tyr Ala Tyr Val Lys Leu Lys 
Glu GlnGlu Ile Arg Asn 200 205 210 Ile Val Trp Ile Ala Glu Cys Ile Ser Gln ArgHis Arg Thr Lys 215 220 225 Ile Asn Ser Tyr Ile Pro Ile Leu 230 21 2232DNA Homo sapiens misc_feature Incyte ID No 6911460CB1 21 attagctttgcccgaagttt ttccccacac tcttctttag catgctatta tggggaaagt 60 gaccactcctgggagcgggg gtggtcgggg cggtttggtg gcggggaagc ggctgtaact 120 tctacgtgaccatggtacct gttgaaaaca ccgagggccc cagtctgctg aaccagaagg 180 ggacagccgtggagacggag ggcagcggca gccggcatcc tccctgggcg agaggctgcg 240 gcatgtttaccttcctgtca tctgtcactg ctgctgtcag tggcctcctg gtgggttatg 300 aacttgggatcatctctggg gctcttcttc agatcaaaac cttattagcc ctgagctgcc 360 atgagcaggaaatggttgtg agctccctcg tcattggagc cctccttgcc tcactcaccg 420 gaggggtcctgatagacaga tatggaagaa ggacagcaat catcttgtca tcctgcctgc 480 ttggactcggaagcttagtc ttgatcctca gtttatccta cacggttctt atagtgggac 540 gcattgccataggggtctcc atctccctct cttccattgc cacttgtgtt tacatcgcag 600 agattgctcctcaacacaga agaggccttc ttgtgtcact gaatgagctg atgattgtca 660 tcggcattctttctgcctat atttcaaatt acgcatttgc caatgttttc catggctgga 720 agtacatgtttggtcttgtg attcccttgg gagttttgca agcaattgca atgtattttc 780 ttcctccaagccctcggttt ctggtgatga aaggacaaga gggagctgct agcaaggttc 840 ttggaaggttaagagcactc tcagatacaa ctgaggaact cactgtgatc aaatcctccc 900 tgaaagatgaatatcagtac agtttttggg atctgtttcg ttcaaaagac aacatgcgga 960 cccgaataatgataggacta acactagtat tttttgtaca aatcactggc caaccaaaca 1020 tattgttctatgcatcaact gttttgaagt cagttggatt tcaaagcaat gaggcagcta 1080 gcctcgcctccactggggtt ggagtcgtca aggtcattag caccatccct gccactcttc 1140 ttgtagaccatgtcggcagc aaaacattcc tctgcattgg ctcctctgtg atggcagctt 1200 cgttggtgaccatgggcatc gtaaatctca acatccacat gaacttcacc catatctgca 1260 gaagccacaattctatcaac cagtccttgg atgagtctgt gatttatgga ccaggaaacc 1320 tgtcaaccaacaacaatact ctcagagacc acttcaaagg gatttcttcc catagcagaa 1380 gctcactcatgcccctgaga aatgatgtgg ataagagagg ggagacgacc tcagcatcct 1440 tgctaaatgctggattaagc cacactgaat accagatagt cacagaccct ggggacgtcc 1500 cagcttttttgaaatggctg tccttagcca gcttgcttgt ttatgttgct 
gctttttcaa 1560 ttggtctagg accaatgccc tggctggtgc tcagcgagat ctttcctggt gggatcagag 1620 gacgagccat ggctttaact tctagcatga actggggcat caatctcctc atctcgctga 1680 catttttgac tgtaactgat cttattggcc tgccatgggt gtgctttata tatacaatca 1740 tgagtctagc atccctgctt tttgttgtta tgtttatacc tgagacaaag ggatgctctt 1800 tggaacaaat atcaatggag ctagcaaaag tgaactatgt gaaaaacaac atttgtttta 1860 tgagtcatca ccaagaagaa ttagtgccaa aacagcctca aaaaagaaaa ccccaggagc 1920 agctcttgga gtgtaacaag ctgtgtggta ggggccaatc caggcagctt tctccagaga 1980 cctaatggcc tcaacacctt ctgaacgtgg atagtgccag aacacttagg agggtgtctt 2040 tggaccaatg catagttgcg actcctgtgc tctcttttca gtgtcatgga actggttttg 2100 aagagacact ctgaaatgat aaagacagcc tttaatcccc ctcctcccca gaaggaacct 2160 caaaaggtag atgaggtaca aggtcctaag tgatctcttt ttctgagcag gatatcaggt 2220 taaaaaaaaaaa 2232
22 4135 DNA Homo sapiens misc_feature Incyte ID No 55138203CB1
22 acaaccccac aggccagctt tttcacatag ttgttaccag cacttggcca acagttgttt 60 ttcatcagtg ggtggagcag cttttcttgc ccccaaaaaa cagtcaacca ctcatttttc 120 attgggtata tgtattcggc aaacattggg tacctgctgt ttgttggcac tggtgttgag 180 aagatgaata acacaccctc tatggcccta gggagttccc attctggtag ggggaacctg 240 actcaggcag caacaaaacc ttctggttat gagaagacag atgatgtttc agagaagacc 300 tcactggctg accaggagga agtaaggact attttcatca accagcccca gctgacaaaa 360 ttctgcaata accatgtcag cactgcaaaa tacaacataa tcacattcct tccaagattt 420 ctctactctc agttcagaag agctgctaat tcattttttc tctttattgc actgctgcag 480 caaatacctg atgtgtcacc aacaggtcgt tatacaacac tggttcctct cttatttatt 540 ttagctgtgg cagctatcaa agagataata gaagatatta aacgacataa agctgataat 600 gcagtgaaca agaaacaaac gcaagttttg agaaatggtg cttgggaaat tgtccactgg 660 gaaaaggtaa atgttggaga tatagttata ataaaaggca aagagtatat acctgctgac 720 actgtacttc tctcatcaag tgagccccaa gccatgtgct acattgaaac atccaactta 780 gatggtgaaa caaacttgaa aattagacag ggcttaccag caacatcaga tatcaaagac 840 gttgacagtt tgatgaggat ttctggcaga attgagtgtg aaagtccaaa cagacatctc 900 tacgattttg ttggaaacat aaggcttgat ggacatggca ccgttccact gggagcagat 960 cagattcttc ttcgaggagc
tcagttgaga aatacacagt gggttcatgg aatagttgtc 1020tacactggac atgacaccaa gctgatgcag aattcaacaa gtccaccact taagctctca 1080aatgtggaac ggattacaaa tgtacaaatt ttgattttat tttgtatctt aattgccatg 1140tctcttgtct gttctgtggg ctcagccatt tggaatcgaa ggcattctgg aaaagactgg 1200tatctcaatc taaactatgg tggcgctagt aattttggac tgaatttctt gaccttcatc 1260atccttttca acaatctcat tcctatcagc ttattggtta cattagaagt tgtgaaattt 1320acccaggcat acttcataaa ttgggatctt gacatgcact atgaacccac agacactgct 1380gctatggctc gaacatctaa tctgaatgag gaacttggcc aggttaaata catattttct 1440gacaaaactg gtactctgac atgcaatgta atgcagttta agaagtgcac catagcggga 1500gttgcttatg gccatgtccc tgaacctgag gattatggct gctctcctga tgaatggcag 1560aactcacagt ttggagatga aaaaacattt agtgattcat cattgctgga aaatctccaa 1620aataatcatc ccaccgcacc tataatatgt gaatttctta caatgatggc agtctgtcac 1680acagcagtgc cagagcgaga aggtgacaag attatttatc aagcagcatc tccagatgag 1740ggagcattgg tcagagcagc caagcaattg aattttgttt tcactggaag aacacccgac 1800tcggtgatta tagattcact ggggcaggaa gaaagatatg aattgctcaa tgtcttggag 1860tttaccagtg ctaggaaaag aatgtcagtg attgttcgca ctccatctgg aaagttacga 1920ctctactgca aaggagctga cactgtaatt tatgatcgac tggcagagac gtcaaaatac 1980aaagaaatta ccctaaaaca tttagagcag tttgctacag aagggttaag aactttatgt 2040tttgctgtgg ctgagatttc agagagcgac tttcaggagt ggcgagcagt ctatcagcga 2100gcatctacat ctgtgcagaa caggctactc aaactcgaag agagttatga gttgattgaa 2160aagaatcttc agctacttgg agcaacagcc attgaggata aattacaaga tcaagtgcct 2220gaaaccatag aaacgctaat gaaagcagac atcaaaatct ggatccttac aggggacaag 2280caagaaactg ccattaacat cggacactcc tgcaaactgt tgaagaagaa catgggaatg 2340attgttataa atgaaggctc tcttgatgga acaagggaaa ctctcagtcg tcactgtact 2400acccttggtg atgctctccg gaaagagaat gattttgctc ttataattga tgggaaaacc 2460ctcaaatatg ccttaacctt tggagtacga cagtatttcc tggacttagc tttgtcatgc 2520aaagctgtca tttgctgtcg ggtttctcct cttcaaaaat ctgaagttgt tgagatggtt 2580aagaaacaag tcaaagtcgt aacgcttgca atcggtgatg gagcaaatga tgtcagcatg 2640atacagacag cgcacgttgg tgttggtatc agtggcaatg aaggcctgca 
ggcagctaat 2700tcctctgact actccatagc tcagttcaaa tatttgaaga atttactgat gattcatggt 2760gcctggaact ataacagagt ctccaagtgc atcttatact gcttctacaa gaatatagtg 2820ctctatatta tcgagatctg gtttgccttt gttaatggct tttctggaca gatcctcttt 2880gaaagatggt gtataggtct ctataacgtg atgtttacag caatgcctcc tttaactctt 2940ggaatatttg agagatcatg cagaaaagag aacatgttga agtaccctga attatacaaa 3000acatctcaga atgccctgga cttcaacacc aaggttttct gggttcattg tttaaatggc 3060ctcttccact cagttattct gttttggttt ccactaaaag cccttcagta tggtactgca 3120tttggaaatg ggaaaacctc ggattatctg ctactgggaa actttgtgta cacttttgtg 3180gtgataactg tgtgtttgaa agctggattg gagacatcat attggacatg gttcagccac 3240atagcgatat gggggagcat cgcactctgg gtggtgtttt tgggaatcta ctcatctctg 3300tggcctgcca ttccgatggc ccctgatatg tcaggagagg cagccatgtt gttcagttct 3360ggagtctttt ggatgggctt gttattcatc cctgtggcat ctctgctcct tgatgtggtg 3420tacaaggtta tcaagaggac tgcttttaaa acattggtcg atgaagttca ggagctggag 3480gcaaaatctc aagacccagg agcagttgta cttggaaaaa gcctgaccga gagggcgcaa 3540ctgctcaaga acgtctttaa gaagaaccac gtgaacttgt accgctctga atccttgcaa 3600caaaatctgc tccatgggta tgcgttctct caagatgaaa atggaatcgt ttcacagtct 3660gaagtgataa gagcatatga taccacgaaa cagaggcccg acgaatggtg atggggagag 3720cctgaaaggc aggctctgtt acctctctaa ggagagctac caggttgtca ccgcagtctg 3780ctaaccaatt ccagtctggt ccatgaagag gaaaggtaga tctgagctca tctcgctgat 3840ggacattcag attcatgtat attatagaca taagcactgt gcaactgtac tgtaacacca 3900tctcttttgg atttttttaa ggtatttgct aagtctttgt aaacggaaat tgaaaatgac 3960ctggtatctt gccagagggc tttcttaaac ggagaataag tcagtattct tatgccatta 4020ctgtggggct gtaactgact gtcagtttat tggctgtacc acaaggtaac caaccattaa 4080aaaactctaa atgatattta gttaaaggga ctctgtggta tccagactta gattt 4135 232970 DNA Homo sapiens misc_feature Incyte ID No 7478871CB1 23 atgcaaccagccagagggcc cctggcttca gaacctagga ctgtactggt tctgagattc 60 tgtgcaagcctcatggaaat gaagctgcca ggccaggaag ggtttgaagc ctccagtgct 120 cctagaaatattccttcagg ggagctggac agcaaccctg accctggcac cggccccagc 180 cctgatggcccctcagacac agagagcaag 
gaactgggag tacccaaaga ccctctgctc 240 ttcattcagctgaatgagct gctgggctgg ccccaggcgc tggagtggag agagacaggc 300 acgtgggtactgtttgagga gaagttggag gtggctgcag gccggtggag tgccccccac 360 gtgcccaccctggcactgcc cagcctccag aagctccgca gcctgctggc cgagggcctt 420 gtactgctggactgcccagc tcagagcctc ctggagctcg tggagcaggt gaccagggtg 480 gagtcgctgagcccagagct gagagggcag ttgcaggcct tgctgctgca gagaccccag 540 cattacaaccagaccacagg caccaggccc tgctggggtg agagcccctc cctgggccca 600 ggaccaagaccctgtacaac cagaccacag gcaccaggcc ctgctgggca gtgtcagaac 660 cccctgagacagaagctacc tccaggagct gaggcaggga ctgtgctggc aggggagctg 720 ggcttcctggcacagccact gggagccttt gttcgactgc ggaaccctgt ggtactgggg 780 tcccttactgaggtgtccct cccaagcagg tttttctgcc ttctcctggg cccctgtatg 840 ctgggaaagggctaccatga gatgggacgg gcagcagctg tcctcctcag tgacccgcaa 900 ttccagtggtcagttcgtcg ggccagcaac cttcatgacc ttctggcagc cctggatgca 960 ttcctagaggaggtgacagt gcttccccca ggtcggtggg acccaacagc ccggattccc 1020 ccgcccaaatgtctgccatc tcagcacaaa aggcttccct cgcaacagcg ggagatcaga 1080 ggtcccgccgtcccgcgcct gacctcggct gaggacaggc accgccatgg gccacacgca 1140 cacagcccggagttgcagcg gaccggcagg ctgtttgggg gccttatcca ggacgtgcgc 1200 aggaaggtcccgtggtaccc cagcgatttc ttggacgccc tgcatctcca gtgcttctcg 1260 gccgtactctacatttacct ggccactgtc actaatgcca tcacttttgg gggtctgctg 1320 ggagatgccactgatggtgc ccagggagtg ctggaaagtt tcctgggcac agcagtggct 1380 ggagctgccttctgcctgat ggcaggccag cccctcacca ttctgagcag cacggggcca 1440 gtgctggtctttgagcgcct gctcttctct ttcagcagag attacagcct ggactacctg 1500 cccttccgcctatgggtggg catctgggtg gctacctttt gcctggtgct ggtggccaca 1560 gaggccagtgtgctggtgcg ctacttcacc cgcttcactg aggaaggttt ctgtgccctc 1620 atcagcctcatcttcatcta cgatgctgtg ggcaaaatgc tgaacttgac ccatacctat 1680 cctatccagaagcctgggtc ctctgcctac gggtgcctct gccaataccc aggcccagga 1740 ggaaatgagtctcaatggat aaggacaagg ccaaaagaca gagacgacat tgtaagcatg 1800 gacttaggcctgatcaatgc atccttgctg ccgccacctg agtgcacccg gcagggaggc 1860 caccctcgtggccctggctg tcatacagtc ccagacattg ccttcttctc ccttctcctc 1920 
ttccttacttctttcttctt tgctatggcc ctcaagtgtg taaagaccag ccgcttcttc 1980 ccctctgtggtgcgcaaagg gctcagcgac ttctcctcag tcctggccat cctgctcggc 2040 tgtggccttgatgctttcct gggcctagcc acaccaaagc tcatggtacc cagagagttc 2100 aagcccacactccctgggcg tggctggctg gtgtcacctt ttggagccaa cccctggtgg 2160 tggagtgtggcagctgccct gcctgccctg ctgctgtcta tcctcatctt catggaccaa 2220 cagatcacagcagtcatcct caaccgcatg gaatacagac tgcagaaggg agctggcttc 2280 cacctggacctcttctgtgt ggctgtgctg atgctactca catcagcgct tggactgcct 2340 tggtatgtctcagccactgt catctccctg gctcacatgg acagtcttcg gagagagagc 2400 agagcctgtgcccccgggga gcgccccaac ttcctgggta tcagggaaca gaggctgaca 2460 ggcctggtggtgttcatcct tacaggagcc tccatcttcc tggcacctgt gctcaagttc 2520 attccaatgcctgtgctcta tggcatcttc ctgtatatgg gggtggcagc gctcagcagc 2580 attcagttcactaatagggt gaagctgttg ttgatgccag caaaacacca gccagacctg 2640 ctactcttgcggcatgtgcc tctgaccagg gtccacctct tcacagccat ccagcttgcc 2700 tgtctggggctgctttggat aatcaagtct acccctgcag ccatcatctt ccccctcatg 2760 ttgctgggccttgtgggggt ccgaaaggcc ctggagaggg tcttctcacc acaggaactc 2820 ctctggctggatgagctgat gccagaggag gagagaagca tccctgagaa ggggctggag 2880 ccagaacactcattcagtgg aagtgacagt gaagattcag agctgatgta tcagccaaag 2940 gctccagaaatcaacatttc tgtgaattag 2970 24 1835 DNA Homo sapiens misc_feature IncyteID No 7483601CB1 24 atggatcatg ctgaagaaaa tgaaatcctt gcagcaacccagaggtacta tgtggaaagg 60 cctatcttta gtcatccggt cctccaggaa agactacacacaaaggacaa ggttcctgat 120 tccattgcgg ataagctgaa acaggcattc acatgtactcctaaaaaaat aagaaatatc 180 atttatatgt tcctacccat aactaaatgg ctgccagcatacaaattcaa ggaatatgtg 240 ttgggtgact tggtctcagg cataagcaca ggggtgcttcagcttcctca aggcttagcc 300 tttgcaatgc tggcagctgt gcctccaata tttggcctgtacccttcatt ttaccctgtt 360 atcatgtatt gttttcttgg aacctccaga cacatatccataggtccttt tgctgttatt 420 agcctgatga ttggtggtgt agctgttcga ttagtaccagatgatatagt cattccagga 480 ggagtaaatg caaccaatgg cacagaggcc agagatgccttgagagtgaa agtcgccatg 540 tctgtgacct tactttcagg aatcattcag ttttgcctaggtgtctgtag gtttggattt 600 gtggccatat 
atctcacaga gcctctggtc cgtgggtttaccaccgcagc agctgtgcat 660 gtcttcacct ccatgttaaa atatctgttt ggagttaaaacaaagcggta cagtggaatc 720 ttttccgtgg tgtatagtac agttgctgtg ttgcagaatgttaaaaacct caacgtgtgt 780 tccctaggcg tcgggctgat ggtttttggt ttgctgttgggtggcaagga gtttaatgag 840 agatttaaag agaaattgcc ggcgcctatt cctttagagttctttgcggt cgtaatggga 900 actggcattt cagctgggtt taacttgaaa gaatcatacaatgtggatgt cgttggaaca 960 cttcctctag ggctgctacc tccagccaat ccggacaccagcctcttcca ccttgtgtac 1020 gtagatgcca ttgccatagc catcgttgga ttttcagtgaccatctccat ggccaagacc 1080 ttagcaaata aacatggcta ccaggttgac ggcaatcaggagctcattgc cctgggactg 1140 tgcaattcca ttggctcact cttccagacc ttttcaatttcatgctcctt gtctcgaagc 1200 cttgttcagg agggaaccgg tgggaagaca cagcttgcaggttgtttggc ctcattaatg 1260 attctgctgg tcatattagc aactggattc ctctttgaatcattgcccca ggctgtgctg 1320 tcggccattg tgattgtcaa cctgaaggga atgtttatgcagttctcaga tctccccttt 1380 ttctggagaa ccagcaaaat agagctgacc atctggcttaccacttttgt gtcctccttg 1440 ttcctgggat tggactatgg tttgatcact gctgtgatcattgctctgct gactgtgatt 1500 tacagaacac agaggtgaaa gaaattcctg gaataaaaatatttcaaata aatgccccaa 1560 tttactatgc aaatagggac tgtatagcca agcttaaaagaaagactggg gtgaacccag 1620 cagtcatcat ggggacaggg gaaaggcgtg gggaatacgctaagggagtc ggaatggaaa 1680 tgggcacggc atgtggtaag cgatgcggag tatgggggttaacaagcgaa aaggggtgga 1740 gaaaattccc aatgtaaaag attttggaag gagaatgacccggaagacac aagttgggtt 1800 tacaataggt tggggagacg gcggaaagag ggtta 183525 2220 DNA Homo sapiens misc_feature Incyte ID No 7487851CB1 25caaggcagca tgagccgatc acccctcaat cccagccaac tccgatcagt gggctcccag 60gatgccctgg cccccttgcc tccacctgct ccccagaatc cctccaccca ctcttgggac 120cctttgtgtg gatctctgcc ttggggcctc agctgtcttc tggctctgca gcatgtcttg 180gtcatggctt ctctgctctg tgtctcccac ctgctcctgc tttgcagtct ctccccagga 240ggactctctt actccccttc tcagctcctg gcctccagct tcttttcacg tggtatgtct 300accatcctgc aaacttggat gggcagcagg ctgcctcttg tccaggctcc atccttagag 360ttccttatcc ctgctctggt gctgaccagc cagaagctac cccgggccat ccagacacct 420ggaaactgtg agcacagagc 
aagggcaagg gcctccctca tgctgcacct ttgtagggga 480cctagctgcc atggcctggg gcactggaac acttctctcc aggaggtgtc cggggcagtg 540gtagtatctg ggctgctgca gggcatgatg gggctgctgg ggagtcccgg ccacgtgttc 600ccccactgtg ggcccctggt gctggctccc agcctggttg tggcagggct ctctgcccac 660agggaggtag cccagttctg cttcacacac tgggggttgg ccttgctggt tatcctgctc 720atggtggtct gttctcagca cctgggctcc tgccagtttc atgtgtgccc ctggaggcga 780gcttcaacgt catcaactca cactcctctc cctgtcttcc ggctcctttc ggtgctgatc 840ccagtggcct gtgtgtggat tgtttctgcc tttgtgggat tcagtgttat cccccaggaa 900ctgtctgccc ccaccaaggc accatggatt tggctgcctc acccaggtga gtggaattgg 960cctttgctga cgcccagagc tctggctgca ggcatctcca tggccttggc agcctccacc 1020agttccctgg gctgctatgc cctgtgtggc cggctgctgc atttgcctcc cccacctcca 1080catgcctgca gtcgagggct gagcctggag gggctgggca gtgtgctggc cgggctgctg 1140ggaagcccca tgggcactgc atccagcttc cccaacgtgg gcaaagtggg tcttatccag 1200gctggatctc agcaagtggc tcacttagtg gggctactct gcgtggggct tggactctcc 1260cccaggttgg ctcagctcct caccaccatc ccactgcctg ttgttggtgg ggtgctgggg 1320gtgacccagg ctgtggtttt gtctgctgga ttctccagct tctacctggc tgacatagac 1380tctgggcgaa atatcttcat tgtgggcttc tccatcttca tggccttgct gctgccaaga 1440tggtttcggg aagccccagt cctgttcagc acaggctgga gccccttgga tgtattactg 1500cactcactgc tgacacagcc catcttcctg gctggactct caggcttcct actagagaac 1560acgattcctg gcacacagct tgagcgaggc ctaggtcaag ggctaccatc tcctttcact 1620gcccaagagg ctcgaatgcc tcagaagccc agggagaagg ctgctcaagt gtacagactt 1680cctttcccca tccaaaacct ctgtccctgc atcccccagc ctctccactg cctctgccca 1740ctgcctgaag accctgggga tgaggaagga ggctcctctg agccagaaga gatggcagac 1800ttgctgcctg gctcagggga gccatgccct gaatctagca gagaagggtt taggtcccag 1860aaatgaccag aacgcctact tctgccttgg ttaatttagc cctaactctc atctgctgga 1920gagtcagctc ccaaactgtt ctttcttgta ggcagaggat atgtgtgtgt gtattacatg 1980ggactgtcta gaggttccat ttcccaatag ggtgggttgc ctttccttgt cttaattagg 2040cctaactgtt ccagagcaga ggccatgatt tagtggacca tgaatgattg agattttgcc 2100tgtgtactat caatgccact tgaacccaag cattcacttt aatacttact gagcatctcc 
2160catgtgcaag gtcctggaac tacagggata agacagggtc catgccgtct caaggcattt 222026 1517 DNA Homo sapiens misc_feature Incyte ID No 7472881CB1 26taagaacaga agtggaaagc cttacttacc acagtttatt atatgtttca tgcccgtgat 60aattactttt ataatgccac ttgtgaaaaa attgatcaga ttaggatgaa tcaccttgct 120ggccaacagt tattggaatg attctccatg tgtgacttcg ttgcactatt acaaaatgtg 180gcaggataga cctgcccagc cattgttgcc gatgttcatt tgtaatgctg ccttaaggag 240atgaggagat gagagccaat tgttccagca gctcagcctg ccctgccaac agttcagagg 300aggagctgcc agtgggactg gaggcgcatg gaaacctgga gctcgttttc acagtggtgc 360ccactgtgat gatggggctg ctcatgttct ctttgggatg ttccgtggag atccggaagc 420tgtggtcgca catcaggaga ccctggggca ttgctgtggg actgctctgc cagtttgggc 480tcatgccttt tacagcttat ctcctggcca ttagcttttc tctgaagcca gtccaagcta 540ttgctgttct catcatgggc tgctgcccgg ggggcaccat ctctaacatt ttcaccttct 600gggttgatgg agatatggat ctcagcatca gtatgacaac ctgttccacc gtggccgccc 660tgggaatgat gccactctgc atttatctct acacctggtc ctggagtctt cagcagaatc 720tcaccattcc ttatcagaac ataggaatta cccttgtgtg cctgaccatt cctgtggcct 780ttggtgtcta tgtgaattac agatggccaa aacaatccaa aatcattctc aagattgggg 840ccgttgttgg tggggtcctc cttctggtgg tcgcagttgc tggtgtggtc ctggcgaaag 900gatcttggaa ttcagacatc acccttctga ccatcagttt catctttcct ttgattggcc 960atgtcacggg ttttctgctg gcacttttta cccaccagtc ttggcaaagg tgcaggacaa 1020tttccttaga aactggagct cagaatattc agatgtgcat caccatgctc cagttatctt 1080tcactgctga gcacttggtc cagatgttga gtttcccact ggcctatgga ctcttccagc 1140tgatagatgg atttcttatt gttgcagcat atcagacgta caagaggaga ttgaagaaca 1200aacatggaaa aaagaactca ggttgcacag aagtctgcca tacgaggaaa tcgacttctt 1260ccagagagac caatgccttc ttggaggtga atgaagaagg tgccatcact cctgggccac 1320cagggccaat ggattgccac agggctctcg agccagttgg ccacatcact tcatgtgaat 1380agcagggact agctggctgg actggccccc ttctttttca gtggccagta aagacagtgt 1440gcagctgaca catgaatctt gttggtaggg ccagtgtgaa tatttaagtg ttcaatgtta 1500gaatatttat attttca 1517 27 2142 DNA Homo sapiens misc_feature Incyte IDNo 7612560CB1 27 ggtgtacatc tacactagac accttcctgc 
ttccctcctt ccagagcagacctctttgtc 60 accccgagct ccttgtttct taagcagtca tgtctgtgac aaaaagtactgagggtcccc 120 agggagccgt tgccatcaaa ttggacctta tgtcgcctcc tgaaagtgccaagaagttgg 180 agaacaagga ctctacattc ttggatgaaa gtccttcaga gtcagcaggcttgaagaaga 240 ccaagggcat aacagtgttc caggccttga ttcacctggt gaaaggcaacatgggcacag 300 ggatcctggg actacccctc gctgtgaaga acgcgggcat cctgatgggcccactcagtc 360 tgctggtgat gggcttcatt gcctgccact gtatgcacat cctggtcaagtgtgcccagc 420 gcttctgtaa gaggcttaac aagcccttta tggactatgg ggacacggtgatgcatggac 480 tagaagccaa ccccaacgcc tggctccaga atcacgctca ctggggaaggcatatcgtga 540 gcttcttcct tattatcacc caacttggct tctgctgtgt gtacattgtgtttttggctg 600 ataatttaaa acaggtagtg gaagctgtta atagcacaac caacaactgctattccaatg 660 agacggtgat tctgaccccc accatggact cgcgactcta catgctctccttcctgccct 720 tcctggtgct gctggtcctc atccggaacc tcaggatctt gaccatcttctccatgctgg 780 ccaacatcag catgctggtc agcttggtca tcatcataca gtacattacccaggaaatcc 840 cagaccccag ccggttgcca ctggtagcaa gctggaagac ctaccctctcttcttcggaa 900 cagccatttt ttcttttgaa agcattggtg tggttctgcc tctggaaaacaagatgaaga 960 atgcccgcca cttcccagcc atcctgtctt tgggaatgtc catcgtcacttccctataca 1020 ttggcatggc ggctctgggc tacctgcggt ttggagatga catcaaggccagcataagcc 1080 ttaacctgcc taactgctgg ctgtaccagt ctgtcaagct tctctacattgccggcatcc 1140 tgtgcaccta tgccctgcag ttctacgtcc ctgcagaaat catcatcccctttgccatct 1200 cccgggtgtc aacacgctgg gcactgcctc tggatctgtc cattcgcctcgtcatggtct 1260 gcctgacatg cctcctggcc atcctcatcc cccgcctgga cctggtcatctccctggtgg 1320 gctccgtgag tggcaccgcc ctggccctca tcatcccacc gctcctggaggtcaccacgt 1380 tctactcaga gggcatgagc cccctcacca tcttcaagga cgtcctgatcagcatcctgg 1440 gcttcgtggg ctttgtggtg gggacctacc aggccctgga cgagctgctcaagtcagaag 1500 actctcaccc cttttccaac tccaccactt ttgttcgcgt ggagctatgcaagaagcagc 1560 caccagaggg ccccaagtgg cagcaactgg ccaaaggaga tgcagccagctaagactgtc 1620 cacactttgg cagacaaccg gttttccctt ttctgggtct gttcaaaaagcaaacattaa 1680 gggtgggcac ataatccaca agccagaaag ttgtgcacgg ctccagtgttgagatgggta 1740 gggccaagat 
gaccagtgtg aaaactctca gatagaaagg agccatgcatattaaatgag 1800 gggcaacaaa catttcaaac gattagataa cattttctcc caactcaaagatcccaacaa 1860 tgaataggag gcatggaagt agatgtgcca atggggaggg atgaggagtgaacatgaata 1920 ttatttgaat agactttacc tcttaattct tgcaacatgc attcttgattacctactgtg 1980 tgccaaacaa gattttgtag aatattgcaa aaatgaccat aaattcctcgtgataatgtg 2040 actttgcacc tgctcctatg aaaagatgaa gtctgtatct gtatcccttaaatttttttg 2100 cttgtttgtc ggttttgttt tgtgttttgt ttttttgaga tg 2142 281661 DNA Homo sapiens misc_feature Incyte ID No 2880370CB1 28 gacactaagctttaaattca agtaaatagg aggctttttt tttttcgcat aagcagaaat 60 gaggaaatcaagaggaagag attagatttc tgttgtgata aatcgaatct gttaaatgcc 120 atgactttttaattgtctta atcacaagtt aaaccggttg tgttgctgct tagatggcta 180 tatatttgtttaaaagtaca gcagtccctc ctactggact ttgatcctac aaaaacaact 240 gttatctaactcaccctcag actgtcactg gaacacctgc atgaagaatg ttctttcatt 300 ttttaaaaacgattttgcat atatgattta tttcagcttt caaaatgatt agaaaacttt 360 ttattgttctacttttgttg cttgtgacta tagaagaagc aaggatgtca tcgctcagtt 420 ttctgaatatagagaagact gaaatactat ttttcacaaa gactgaagaa accatccttg 480 taagttcaagctacgaaaat aaacggccta attccagcca cctctttgtg aaaatagaag 540 atcctaaaatactacaaatg gtgaatgtgg ccaagaagat ctcatcagat gctacaaact 600 ttaccataaatctggtgact gatgaagaag gagaaacaaa tgtgactatt caactctggg 660 attctgaaggtaggcaagaa agactcattg aagaaatcaa gaatgtgaaa gtcaaagtgc 720 tcaaacaaaaagacagtcta ctccaggcac caatgcatat tgatagaaat atcctaatgc 780 ttattttaccactaatacta ttgaataagt gtgcatttgg ttgtaagatt gaattacagc 840 tgtttcaaacagtatggaag agacctttgc cagtaattct tggggcagtt acacagtttt 900 ttctgatgccattttgcggg tttcttttgt ctcagattgt ggcattgcct gaggcgcaag 960 cttttggagttgtaatgacc tgcacgtgcc caggaggggg tgggggctat ctctttgctc 1020 tgcttctagatggagatttc acattggcca ttttgatgac ttgcacatca acattattgg 1080 ctctgatcatgatgcctgtc aattcttata tatacagtag gatattaggg ttgtcaggta 1140 cattccatattcctgtttct aaaattgtgt caacactcct tttcatactt gtgccagtat 1200 caattggaatagtcatcaag catagaatac ctgaaaaagc aagcttctta gagagaataa 1260 ttagacctctgagttttatt 
ttaatgttcg taggaattta tttgactttc acagtgggat 1320 tagtgttcttaaaaacagat aatctagagg tgattctgtt gggtctctta gttcctgctt 1380 tgggtttgctgtttgggtac tcctttgcta aagtttgtac gctgcctctt cctgtttgta 1440 aaactgttgctattgaaagt gggatgttaa atagtttctt agctcttgcc gttattcagc 1500 tgtcttttccacagtccaag gccaatttag cttctgtggc tccttttaca gtagccatgt 1560 gttctggatgtgaaatgtta ctgatcattc tagtttacaa ggctaagaaa agatgtatct 1620 ttttcttacaagataaaagg aaaagaaatt tcctaatcta a 1661 29 1501 DNA Homo sapiensmisc_feature Incyte ID No 6267489CB1 29 ccagaggaaa ctagtcacaa aaaccctgactatcacctga tagattgctt gtgctgcctg 60 ataattactc gcacttttcc caggctagtgcaaatcttca ggggccgtcc aggactacag 120 agctgtttca ccctaccttg gcttcaatctcttcccccat gctcgaaggt gcggagctgt 180 acttcaacgt ggaccatggc tacctggagggcctggttcg aggatgcaag gccagcctcc 240 tgacccagca agactatatc aacctggtccagtgtgagac cctagaagac ctgaaaattc 300 atctccagac tactgattat ggtaactttttggctaatca cacaaatcct cttactgttt 360 ccaaaattga cactgagatg aggaaaagactatgtggaga atttgagtat ttccggaatc 420 attccctgga gcccctcagc acatttctcacctatatgac gtgcagttat atgatagaca 480 atgtgattct gctgatgaat ggtgcattgcagaaaaaatc tgtgaaagaa attctgggga 540 agtgccaccc cttgggccgt ttcacagaaatggaagctgt caacattgca gagacacctt 600 cagatctctt taatgccatt ctgatcgaaacgccattagc tccattcttc caagactgca 660 tgtctgaaaa tgctctagat gaactgaatattgaattgct acgcaataaa ctatacaagt 720 cttaccttga ggcattctat aaattctgtaagaatcatgg tgatgtcaca gcagaagtta 780 tgtgtcccat tcttgagttt gaggccgacagacgtgcttt tatcatcact cttaactcct 840 ttggcactga attgagcaaa gaagaccgagagaccctcta tccaaccttc ggcaaactct 900 atcctgaggg gttgcggctg ttggctcaagcagaagactt tgaccagatg aagaacgtag 960 cggatcatta cggagtatac aaacctttatttgaagctgt aggtggcagt gggggaaaga 1020 cattggagga cgtgttttac gagcgtgaggtacaaatgaa tgtgctggca ttcaacagac 1080 agttccacta cggtgtgttt tatgcatatgtaaagctgaa ggaacaggaa attagaaata 1140 ttgtgtggat agcagaatgt atttcacagaggcatcgaac taaaatcaac agttacattc 1200 caattttata acccaagtaa ggttctcaaatgtagaaaat tataaatgtt aaaaggaagt 1260 tattgaagaa aataaaagaa 
attatgttatattatctaga ctacacaaaa gtaagccaca 1320 ctatatcttc atgagttgca aatccatggaaacacagtaa accagccctg aaacaaagca 1380 tttccttgtt ttcagtggta ttagatcttgtttccacatg tctgtctcat tcttcactgg 1440 gccttacagg ttagttttaa ttaactctatggtatttttc tattcttgtc tgatcatgtt 1500 a 1501 30 5526 DNA Homo sapiensmisc_feature Incyte ID No 7484777CB1 30 caggctgttt tgtgcaggct gtccctcttcttcaaaatcg tgcatcccct ccccgaagca 60 gcaggcagtg tgcctccatt cagccacatttggtatgcat gagcacggct gcagagagag 120 gggaggtggc tgttttaaga aggttcaggggctcaggcaa ggctacttga ctagtcttcc 180 aagttccagg aagcctctgc cctaatggaatttgcaggtg tggagatgac catgggatgc 240 cagagccgtg ggggaccgtt tattttctaggcattgctca ggttttcagt ttcttgtttt 300 cctggtggaa tttggaaggg gtcatgaatcaggctgatgc tcctcgaccc ctaaactgga 360 ccatccggaa gctgtgccac gcagcctttcttccatctgt cagacttctg aaggctcaga 420 aatcctggat agaaagagca ttttataaaagagaatgtgt ccacatcata cccagcacca 480 aagaccccca taggtgttgc tgtgggcgtctgataggcca gcatgttggc ctcaccccca 540 gtatctccgt gcttcagaat gagaaaaatgaaagtcgcct ctcccgaaat gacatccagt 600 ctgaaaagtg gtccatcagc aaacacactcaactcagccc tacggatgct tttgggacca 660 ttgagttcca aggaggtggc cattccaacaaagccatgta tgtgcgagta tcttttgata 720 caaaacctga tctcctctta cacctgatgaccaaggaatg gcagttggag cttcccaagc 780 ttctcatctc tgtccatggg ggcctgcagaactttgaact ccagccaaaa ctcaagcaag 840 tctttgggaa agggctcatc aaagcagctatgacaactgg agcgtggata ttcactggag 900 gggttaacac aggtgttatt cgtcatgttggcgatgcctt gaaggatcat gcctctaagt 960 ctcgaggaaa gatatgcacc ataggtattgccccctgggg aattgtggaa aaccaggagg 1020 acctcattgg aagagatgtt gtccggccataccagaccat gtccaatccc atgagcaagc 1080 tcactgttct caacagcatg cattcccacttcattctggc tgacaacggg accactggaa 1140 aatatggagc agaggtgaaa cttcgaagacaactggaaaa gcatatttca ctccagaaga 1200 taaacacaag aatcggtcaa ggtgttcctgtggtggcact catagtggaa ggaggaccca 1260 atgtgatctc gattgttttg gagtaccttcgagacacccc tcccgtgcca gtggttgtct 1320 gtgatgggag tggacgggca tcggacatcctggcctttgg gcataaatac tcagaagaag 1380 gcggactgat aaatgaatct ttgagggaccagctgttggt gactatacag aagactttca 1440 
catacactcg aacccaagct cagcatctgttcatcatcct catggagtgc atgaagaaga 1500 aggaattgat tacggtattt cggatgggatcagaaggaca ccaggacatt gatttggcta 1560 tcctgacagc tttactcaaa ggagccaatgcctcggcccc agaccaactg agcttagctt 1620 tagcctggaa cagagtcgac atcgctcgcagccagatctt tatttacggg caacagtggc 1680 cggtgggatc tctggagcaa gccatgttggatgccttagt tctggacaga gtggattttg 1740 tgaaattact catagagaat ggagtaagcatgcaccgttt tctcaccatc tccagactag 1800 aggaattgta caatacgaga catgggccctcaaatacatt gtaccacttg gtcagggatg 1860 tcaaaaaggg gaacctgccc ccagactacagaatcagcct gattgacatc ggcctggtga 1920 tcgagtacct gatgggcggg gcttatcgctgcaactacac gcgcaagcgc ttccggaccc 1980 tctaccacaa cctcttcggc cccaagaggcccaaagcctt gaaactgctg ggaatggagg 2040 atgatattcc cttgaggcga ggaagaaagacaaccaagaa acgtgaagaa gaggtggaca 2100 ttgacttgga tgatcctgag atcaaccacttccccttccc tttccatgag ctcatggtgt 2160 gggctgttct catgaagcgg cagaagatggccctgttctt ctggcagcac ggtgaggagg 2220 ccatggccaa ggccctggtg gcctgcaagctctgcaaagc catggctcat gaggcctctg 2280 agaacgacat ggttgacgac atttcccaggagctgaatca caattccaga gactttggcc 2340 agctggctgt ggagctcctg gaccagtcctacaagcagga cgaacagctg gccatgaaac 2400 tgctgacgta tgagctgaag aactggagcaacgccacgtg cctgcagctt gccgtggctg 2460 ccaaacaccg cgacttcatc gcgcacacgtgcagccagat gctgctcacc gacatgtgga 2520 tgggccggct ccgcatgcgc aagaactcaggcctcaaggt aattctggga attctacttc 2580 ctccttcaat tctcagcttg gagttcaagaacaaagacga catgccctat atgtctcagg 2640 cccaggaaat ccacctccaa gagaaggaggcagaagaacc agagaagccc acaaaggaaa 2700 aagaggaaga ggacatggag ctcacagcaatgttgggacg aaacaacggg gagtcctcca 2760 ggaagaagga tgaagaggaa gttcagagcaagcaccggtt aatccccctc ggcagaaaaa 2820 tctatgaatt ctacaatgca cccatcgtgaagttctggtt ctacacactg gcgtatatcg 2880 gatacctgat gctcttcaac tatatcgtgttagtgaagat ggaacgctgg ccgtccaccc 2940 aggaatggat cgtaatctcc tatattttcaccctgggaat agaaaagatg agagagattc 3000 tgatgtcaga gccagggaag ttgctacagaaagtgaaggt atggctgcag gagtactgga 3060 atgtcacgga cctcatcgcc atccttctgttttctgtcgg aatgatcctt cgtctccaag 3120 accagccctt caggagtgac 
gggagggtcatctactgcgt gaacatcatt tactggtata 3180 tccgtctcct agacatcttc ggcgtgaacaagtatttggg cccgtatgta atgatgattg 3240 gaaaaatgat gatagacatg atgtactttgtcatcattat gctggtggtt ctgatgagct 3300 ttggggtcgc caggcaagcc atcctttttcccaatgagga gccatcatgg aaactggcca 3360 agaacatctt ctacatgccc tattggatgatttatgggga agtgtttgcg gaccagatag 3420 accctccctg tggacagaat gagacccgagaggatggtaa aataatccag ctgcctccct 3480 gcaagacagg agcttggatc gtgccggccatcatggcctg ctacctctta gtggcaaaca 3540 tcttgctggt caacctcctc attgctgtctttaacaatac attttttgaa gtaaaatcga 3600 tatccaacca agtctggaag tttcagaggtatcagctcat catgactttc catgaaaggc 3660 cagttctgcc cccaccactg atcatcttcagccacatgac catgatattc cagcacctgt 3720 gctgccgatg gaggaaacac gagagcgacccggatgaaag ggactacggc ctgaaactct 3780 tcataaccga tgatgagctc aagaaagtacatgactttga agagcaatgc atagaagaat 3840 acttcagaga aaaggatgat cggttcaactcatctaatga tgagaggata cgggtgactt 3900 cagaaagggt ggagaacatg tctatgcggctggaggaagt caacgagaga gagcactcca 3960 tgaaggcttc actccagacc gtggacatccggctggcgca gctggaagac cttatcgggc 4020 gcatggccac ggccctggag cgcctgacaggtctggagcg ggccgagtcc aacaaaatcc 4080 gctcgaggac ctcgtcagac tgcacggacgccgcctacat tgtccgtcag agcagcttca 4140 acagccagga agggaacacc ttcaagctccaagagagtat agaccctgca ggtgaggaga 4200 ccatgtcccc aacttctcca accttaatgccccgtatgcg aagccattct ttctattcag 4260 tcaatatgaa agacaaaggt ggtatagaaaagttggaaag tatttttaaa gaaaggtccc 4320 tgagcctaca ccgggctact agttcccactctgtagcaaa agaacccaaa gctcctgcag 4380 cccctgccaa caccttggcc attgttcctgattccagaag accatcatcg tgtatagaca 4440 tctatgtctc tgctatggat gagctccactgtgatataga ccctctggac aattccgtga 4500 acatccttgg gctaggcgag ccaagcttttcaactccagt accttccaca gccccttcaa 4560 gtagtgccta tgcaacactt gcacccacagacagacctcc aagccggagc attgattttg 4620 aggacatcac ctccatggac actagatctttttcttcaga ctacacccac ctcccagaat 4680 gccaaaaccc ctgggactca gagcctccgatgtaccacac cattgagcgt tccaaaagta 4740 gccgctacct agccaccaca ccctttcttctagaagaggc tcccattgtg aaatctcata 4800 gctttatgtt ttccccctca aggagctattatgccaactt tggggtgcct 
gtaaaaacag 4860 cagaatacac aagtattaca gactgtattgacacaaggtg tgtcaatgcc cctcaagcaa 4920 ttgcggacag agctgccttc cctggaggtcttggagacaa agtggaggac ttaacttgct 4980 gccatccaga gcgagaagca gaactgagtcaccccagctc tgacagtgag gagaatgagg 5040 ccaaaggccg cagagccacc attgcaatatcctcccagga gggtgataac tcagagagaa 5100 ccctgtccaa caacatcact gttcccaagatagagcgcgc caacagctac tcggcagagg 5160 agccaagtgc gccatatgca cacaccaggaagagcttctc catcagtgac aaactcgaca 5220 ggcagcggaa cacagcaagc ctgcgaaatcccttccagag aagcaagtcc tccaagccgg 5280 agggccgagg ggacagcctg tccatgaggaaactgtccag aacatcggct ttccaaagct 5340 ttgaaagcaa gcacacctaa accttcttaatatccgccac agaaggctca agaatccagc 5400 cctaaaattc tctccaactc cagtttttcccctttccttg aatcatacct gctttattct 5460 tagctgagca aaacaagcaa tgctttgggaggtgttaact caaaggtgac ttctgggcca 5520 cagatc 5526 31 2739 DNA Homosapiens misc_feature Incyte ID No 2493969CB1 31 gcgcagtaag tgcggactgccagccaccag ccttggcagc cagctcgtcg cctccagccc 60 cgaccccgac attcatgcccaggagaaggc tgcactgggt ccctctgggc ctttcctaaa 120 agggagatcc ctgttcactagatgagttcc agaaccatcc actaaggctt tgtagccccc 180 ttccatcagc tgaccttcactgcatcccct atcgctcaag atgagtggct tcttcacctc 240 gctggacccc cggcgggtgcagtggggagc tgcctggtat gcaatgcact ccaggatcct 300 acgcaccaaa ccagtggagtccatgctaga gggaactggg accaccacgg cacatggaac 360 taagctagcc caggtactcaccacagtgga cctcatctct cttggcgttg gcagctgtgt 420 gggcactggc atgtatgtggtctctggcct ggtggccaag gaaatggcag gacctggtgt 480 cattgtgtcc ttcatcattgcagccgtcgc atccatatta tcaggcgtct gctatgcaga 540 gtttggagtt cgagtccccaagaccacagg atctgcctac acctacagct atgtcactgt 600 tggggaattt gtggcatttttcattggctg gaacctgatc ctggagtacc tgattggcac 660 tgcggccgga gccagtgctctgagcagcat gtttgactca ctagccaacc acaccatcag 720 ccgctggatg gcggacagcgtgggaaccct caatggcctg gggaaaggtg aagaatcata 780 cccagacctt ctggctctgttgatcgcggt catcgtgacc atcattgttg ctctgggggt 840 gaagaattcc ataggcttcaacaatgttct caatgtgctg aacctggcag tatgggtgtt 900 catcatgatc gcaggcctcttcttcatcaa tgggaaatac tgggcggagg gccagttctt 960 gccccacggc tggtcaggggtgctgcaagg 
agcagcaaca tgcttctacg ctttcattgg 1020 ctttgacatc atcgccaccactggagagga agccaagaat cccaacacgt ccatccctta 1080 tgctatcact gcctccctggtcatctgcct gacagcatat gtgtctgtga gcgtgatctt 1140 aactctgatg gtgccatattataccattga cacggaatcc ccactcatgg agatgtttgt 1200 ggctcatggg ttctatgctgccaaattcgt agtggccatt gggtcggttg caggactgac 1260 agtcagcttg ctggggtccctcttcccgat gccgagggtc atttatgcca tggctggtga 1320 cgggctcctt ttcaggttcctggctcacgt cagctcctac acagagacac cagtggtggc 1380 ctgcatcgtg tcggggttcctggcagcgct cctcgcactg ttggtcagct tgagagacct 1440 gatagagatg atgtctatcggcacgctcct ggcctacacc ttggtctctg tctgtgtctt 1500 gctccttcga taccaacctgagagtgacat tgatggtttt gtcaagttct tgtctgagga 1560 gcacaccaag aagaaggagggcattctggc tgactgtgag aaggaagctt gttctcctgt 1620 gagtgagggg gatgagttttctggcccagc caccaacaca tgtggggcca agaacttacc 1680 atccttggga gacaatgagatgctcatagg gaaatcagac aagtcaacct acaacgtcaa 1740 ccaccccaat tacggcaccgtggacatgac cacaggcata gaagctgatg aatccgaaaa 1800 tatttatctc atcaagttaaagaagctgat tgggcctcat tattacacca tgagaatccg 1860 gctgggcctt ccaggcaaaatggaccggcc cacagcagcg acggggcaca cggtgaccat 1920 ctgcgtgctc ctgctcttcatcctcatgtt catcttctgc tccttcatca tctttggttc 1980 tgactacatc tcagagcagagctggtgggc catccttctg gttgttctga tggtgctgct 2040 gatcagcacc ctggtgtttgtgatcctgca gcagccagag aaccccaaga agctgcccta 2100 catggcccct tgcctcccctttgtgcctgc ctttgccatg ctggtgaaca tctatctcat 2160 gctaaagctc tccaccatcacatggatccg gtttgcggtc tggtgctttg tgggtctgct 2220 catttatttt ggatatggcatctggaacag caccctggaa atcagcgctc gagaagaggc 2280 cctgcaccaa agcacgtaccaacgctacga cgtggatgac cccttctcag tggaggaggg 2340 tttctcctac gccacagagggcgagagcca ggaggactgg ggcgggccca ctgaagacaa 2400 aggcttctat taccaacagatgtcagatgc gaaggcaaac ggccggacaa gtagcaaagc 2460 gaagagcaaa agcaaacacaaacagaactc agaggccctg attgcaaatg atgagttaga 2520 ttactctcca gagtaggagaaacacacaag tgggtagaaa tggtgatgac tgattttcag 2580 taacttaacc tgtgggctagaaggtgaaaa cttttttggc tctcatttca caaatccagc 2640 cttccccaaa ttcaatccctagtcatagcc tgtcatttgc tacttttgct cttcaggata 2700 
gttctgttga agggcttaacctgggtcccc taactggtc 2739 32 4321 DNA Homo sapiens misc_feature IncyteID No 3244593CB1 32 atggtgggtg aaggacccta ccttatctca gatctggaccagcgaggccg gcggagatcc 60 tttgcagaaa gatatgaccc cagcctgaag accatgatcccagtgcgacc ctgtgcaagg 120 ttagcaccca acccggtgga tgatgccggg ctactctccttcgccacatt ttcctggctc 180 acgccggtga tggtgaaagg ctaccggcaa aggctgaccgtagacaccct gcccccattg 240 tcgacatatg actcatctga caccaatgcc aaaagatttcgagtcctttg ggatgaagag 300 gtagcaaggg tgggtcctga gaaggcctct ctgagccacgtggtgtggaa attccagagg 360 acacgcgtgt tgatggacat cgtggccaac atcctgtgcatcatcatggc agccataggg 420 ccgacagttc tcattcacca aatcctccag cagactgagaggacctctgg gaaagtctgg 480 gttggcattg gactgtgcat agcccttttt gccaccgagtttaccaaagt cttcttttgg 540 gcccttgcct gggccatcaa ctaccgcacg gccatccggttgaaggtggc gctctccacc 600 ttggtttttg aaaacctagt gtccttcaag acattgacccacatctctgt tggcgaggtg 660 ctcaatatac tgtcaagtga tagctattct ttgtttgaagctgccttgtt ttgtcctttg 720 ccagccacca tcccgatcct aatggtcttt tgtgcggcgtacgccttttt cattctgggg 780 cccacagctc tcatcgggat atcagtgtat gtcatattcatacccgtcca gatgtttatg 840 gccaagctca attcagcttt ccgaaggtca gcaattttggtgacagacaa gcgagttcag 900 acaatgaatg agtttctgac ctgcatcagg ctgatcaaaatgtatgcctg ggagaaatct 960 tttaccaaca ctatccaaga tataagaagg agggaaagaaaattactgga aaaagctgga 1020 tttgtccaaa gtggaaactc tgccctggcc cccatcgtgtccaccatagc catcgtgctg 1080 acattatcct gccacatcct cctgagacgc aaactcaccgcacccgtggc atttagtgtg 1140 attgccatgt ttaatgtaat gaagttttcc attgcaatcttgcccttctc catcaaagca 1200 atggctgaag cgaatgtctc tctaaggaga atgaagaaaattctcataga taaaagcccc 1260 ccatcttaca tcacccaacc agaagaccca gatactgtcttgcttttagc aaatgccacc 1320 ttgacatggg agcatgaagc cagcaggaaa agtaccccaaagaaattgca gaaccagaaa 1380 aggcatttat gcaagaaaca gaggtcagag gcatacagtgagaggagtcc accagccaag 1440 ggagccactg gcccagagga gcaaagtgac agcctcaaatcggttctgca cagcataagc 1500 tttgtggtga gaaaggggaa gatcttggga atatgtgggaatgtgggaag tggaaagagc 1560 tccctccttg cagctctcct aggacagatg cagctgcagaaaggggtggt ggcagtcaat 1620 ggaactttgg 
cctacgtttc acagcaggca tggatctttcatggaaatgt gagagaaaac 1680 atactctttg gagaaaagta tgatcaccaa aggtatcagcacacagtccg cgtctgtggc 1740 ctccagaagg acctgagcaa cctcccctat ggagacctgactgagattgg ggagcggggc 1800 ctcaacctct ctggggggca gaggcagagg attagcctggcccgcgctgt ctactccgac 1860 cgtcagctct acctgctgga cgaccccctg tcggccgtggacgcccacgt ggggaagcac 1920 gtctttgagg agtgcattaa gaagacgctc aggggaaagacagtcgtcct ggtgacccac 1980 cagctacagt tcttagagtc ttgtgatgaa gttattttattagaagatgg agagatttgt 2040 gaaaagggaa cccacaagga gttaatggag gagagagggcgctatgcaaa actgattcac 2100 aacctgcgag gattgcagtt caaggatcct gaacacctttacaatgcagc aatggtggaa 2160 gccttcaagg agagccctgc tgagagagag gaagatgctggtataatcgt tttggctcca 2220 ggaaatgaga aagatgaagg aaaagaatct gaaacaggctcagaatttgt agacacaaaa 2280 gggtacctcc tttctctctt cactgtgttc ctcttcctcctgatgattgg cagcgctgcc 2340 ttcagcaact ggtggctggg tctctggttg gacaagggctcacggatgac ctgtgggccc 2400 cagggcaaca ggaccatgtg tgaggtcggc gcggtgctggcagacatcgg tcagcatgtg 2460 taccagtggg tgtacactgc aagcatggtg ttcatgctggtgtttggcgt caccaaaggc 2520 ttcgtcttca ccaagaccac actgatggca tcctcctctctgcatgacac ggtgtttgat 2580 aagatcttaa agagcccaat gagtttcttt gacacgactcccactggcag gctaatgaac 2640 cgtttttcca aggatatgga cgagctggat gtgaggctgccgtttcacgc agagaacttt 2700 ctgcagcagt tttttatggt ggtgtttatt ctcgtgatcttggctgctgt gtttcctgct 2760 gtccttttag tcgtggccag ccttgctgta ggcttcttcattctgttacg cattttccac 2820 agaggagtcc aggagctcaa gaaggtggag aatgtcagccggtcaccctg gttcacccac 2880 atcacctcct ccatgcaggg cctgggcatc attcacgcctatggcaagaa ggagagctgc 2940 atcacctatc acctcctcta ctttaactgt gctctcaggtggtttgcgct gagaatggat 3000 gtcctcatga acatccttac cttcactgtg gccttgttggtgaccctgag tttctcctcc 3060 atcagtactt catccaaagg cctgtcattg tcatacatcatccagctgag cggactgctc 3120 caagtgtgtg tgcgaacggg aacagagacg caagccaaattcacctccgt ggagctgctc 3180 agggaataca tttcgacctg tgttcctgaa tgcactcatcccctcaaagt ggggacctgt 3240 cccaaggact ggcccagctg tggggagatc accttcagagactatcagat gagatacaga 3300 gacaacaccc cccttgttct cgacagcctg 
aacttgaacatacaaagtgg gcagacagtc 3360 gggattgttg gaagaacagg ttccggaaag tcatcgttaggaatggcttt gtttcgtctg 3420 gtggagccag ccagtggcac aatctttatt gatgaggtggatatctgcat tctcagcttg 3480 gaagacctca gaaccaagct gactgtgatc ccacaggatcctgtcctgtt tgtaggtaca 3540 gtaaggtaca acttggatcc ctttgagagt cacaccgatgagatgctctg gcaggttctg 3600 gagagaacat tcatgagaga cacaataatg aaactcccagaaaaattaca ggcagaagtc 3660 acagaaaatg gagaaaactt ctcagtaggg gaacgtcagctgctttgtgt ggcccgagct 3720 cttctccgta attcaaagat cattctcctt gatgaagccaccgcctctat ggactccaag 3780 actgacaccc tggttcagaa caccatcaaa gatgccttcaagggctgcac tgtgctgacc 3840 atcgcccacc gcctcaacac agttctcaac tgcgatcacgtcctggttat ggaaaatggg 3900 aaggtgattg agtttgacaa gcctgaagtc cttgcagagaagccagattc tgcatttgcg 3960 atgttactag cagcagaagt cagattgtag aggtcctggcggctgattct agaggaggaa 4020 gaggctctgt gagatgaata ggaggagtct tcaggaggaggggctgtcct ctccgcaggc 4080 agccctggtc ttcagcccct cccatccacg gagtgagctggggctgaagt tgtccccact 4140 gccatactca gtccatgtca ccccacttgg tgggcttggggttggttctg ggtggtgaac 4200 cggggcagac ccagctaatg gattaaaaaa ctgcccttcacctcccaaat ccccaagggt 4260 tcctcatgtg ttttcaccaa aaccacccca gtgcctgagattgaaaatat tgtaactttc 4320 a 4321 33 4519 DNA Homo sapiens misc_featureIncyte ID No 4921451CB1 33 ttcaggaccg ttggcaccgg gctaacggtt ccaccacgtccgccgccctg gacgcccgcg 60 gcctgccccc ccctgcctct cctgcgccga tacacttcgagtggattctg gccatttgag 120 cattctctcc aactctccaa tccccagtct gcccccacgggggtctcccc cacctctccc 180 ccgtcccaca gcctaaaccc ctcttcgccc tgaacctcccttttcctcat gcggtgaatg 240 ggcactggcc ccgctcagac tcccaggagc accagagctggccctgagcc aagccctgcc 300 ccaccaggac ctggggacac gggtgactca gacgtgactcaggaaggctc aggtcctgct 360 ggcatccgcg gagccccacc agcatgggca gcctcggccagagagaagat ctccgagatg 420 aggacaggaa ctcaggtgct gatcctgggc ggagggggcggtgcagcatt cacctggaag 480 gtccaggcca acaaccgtgc ctacaacggg cagttcaaggagaaggtgat cctgtgctgg 540 caaaggaaga aatacaagac caatgtcatc cgcacggccaagtacaactt ctactcgttc 600 ctgccgctga acctgtacga gcagttccac cgcgtgtccaacctgttctt cctcatcatc 660 atcatcctgc 
agagcattcc cgacatctcc acgctgccctggttctcgct cagtacccct 720 atggtctgcc tcctcttcat ccgtgccacc cgggacctggtggacgacat ggggagacac 780 aagagtgaca gagccatcaa caacagaccc tgccagattctgatggggaa gagcttcaag 840 cagaagaaat ggcaggatct gtgcgtgggg gatgtggtctgtctccgcaa ggacaacatc 900 gtcccagtga gctggggtgg accccgaggt cccagaaccacgcgccccct caccgagagc 960 acccctccca gggtggggag ggctgccgca cccccaatttgtcttgcatc ccctcttgca 1020 acgctgcccc ccactccaca ccaggccgac atgctcttgctggccagcac ggagcccagc 1080 agcctgtgct atgtggagac ggtggacatt gacggggagaccaacttgaa gttcagacag 1140 gccctgatgg tcacccacaa agaactggcc actataaagaagatggcgtc ctttcaaggc 1200 acagtgacgt gtgaggcgcc taacagtcgg atgcaccacttcgtggggtg cctggaatgg 1260 aatgacaaga aatactccct ggacattggc aacctcctcctccgaggctg caggattcgc 1320 aacacagaca cctgctatgg actggtcatt tatgctggttttgacacaaa aattatgaag 1380 aactgtggca agatccattt gaagagaacc aagctggacctcctgatgaa caagctggtg 1440 gttgtgatct tcatctccgt ggtgcttgtc tgcctggtgttggccttcgg cttcggtttc 1500 tcagtcaaag aattcaaaga ccaccactac tacctctcgggggtgcatgg gagcagcgtg 1560 gccgcagagt ccttcttcgt cttctggagc ttcctcatcctgctcagcgt caccatcccg 1620 atgtccatgt tcatcctgtc cgagttcatc tacctggggaacagcgtctt catcgactgg 1680 gacgtgcaga tgtactacaa gccgcaggac gtgcctgccaaggcccgcag caccagcctc 1740 aacgaccacc tgggccaggt ggaatacatc ttctcggacaagacgggcac gctcacgcag 1800 aacatcttga ccttcaacaa gtgctgcatc agcggccgcgtctatggaga acccctacct 1860 ctggaacaag ttcgccgacg ggaagctgct cttccacaatgcggccctgc tgcacctcgt 1920 gcggaccaac ggggacgagg ccgtgcggga gttctggcgcctgctggcca tctgccacac 1980 ggtgatgacc agctgttgta ccaggcggcc tcccccgacgagggggcgct ggtcaccgca 2040 gcccggaact tcggctacgt gttcctgtcc cgcacccaggacaccgtcac gatcatggag 2100 ctgggggagg aacgggtcta ccaggtcctg gccataatggacttcaacag cacgcgcaaa 2160 cggatgtcgg tgctggttcg aaagccagag ggcgccatctgcctgtacac caagggcgcc 2220 gacacggtca tcttcgaacg cttgcacagg aggggggcaatggaatttgc cacagaggag 2280 gccttggctg cctttgccca ggagaccctg cggacactgtgcctggccta cagggaggtg 2340 gctgaggaca tttacgagga ctggcagcag cgccaccaggaggccagcct 
cctgctgcag 2400 aaccgggcac aggccctgca acaggtgtac aacgagatggagcaggacct caggctgctg 2460 ggagccacag ccatcgagga cagactccag gacggtgtccctgaaaccat caaatgtctc 2520 aagaagagca acatcaaaat atgggtgctc accggggacaagcaggaaac ggctgtgaac 2580 atcggcttcg cctgcgagct gctgtcagag aatatgctcattctggagga gaaggagatt 2640 agccgcatcc tggagaccta ctgggaaaac agtaacaaccttctaaccag ggagtccctg 2700 tcgcaggtca agctggcctt ggtcattaac ggagacttcctggacaaact gctggtgtcc 2760 ctgcggaagg agccgcgcgc cctggcgcag aacgtgaacatggacgaggc gtggcaggag 2820 ctcggccagt ccaggaggga tttcctctac gccaggcgcctgtccctgct gtgccggagg 2880 ttcgggctcc cgctggctgc accgccagcc caggactccagagcccgccg tagctccgag 2940 gtgctgcagg agcgcgcctt cgtggacctg gcgtccaagtgccaggcggt catctgctgc 3000 cgcgtgacgc ccaagcagaa ggccctgatc gtggccctggtcaagaagta ccaccaggtg 3060 gtgaccctgg ccatcgggga cggtgccaac gacatcaacatgatcaagac cgcggacgtg 3120 ggcgtggggc tggcgggcca ggagggcatg caggcagttcagaacagcga cttcgtgctc 3180 ggccagttct gcttcctgca gcgcctcctg ctggtgcacggccgctggtc ctacgtgcgg 3240 atctgcaagt tcctgcgcta cttcttctac aagagcatggccagcatgat ggtgcaggtc 3300 tggtttgcct gctacaacgg cttcaccggc caggacgtgagcgcagagca gagcctggag 3360 aagccggagc tgtacgtggt ggggcagaag gacgagctcttcaactactg ggtcttcgtc 3420 caagccatcg cccatggtgt gaccacctct ctggtcaacttcttcatgac actgtggatc 3480 agccgcgaca cggcgggacc cgccagcttc agcgaccaccagtcctttgc ggtcgtggtg 3540 gccctgtctt gcctgctgtc catcaccatg gaggtcattcttatcatcaa gtactggacc 3600 gccctgtgcg tggcgaccat cctcctcagc cttggtttctacgccatcat gactaccacc 3660 acccagagct tctggctctt cagagtatcc cccacgaccttcccgtttct gtatgccgac 3720 ctcagcgtga tgtcctctcc ctccatcctg ctggtggtcctgctgagtgt gtccataaac 3780 accttccctg tcctggccct ccgagtcatc ttcccagccctcaaggagct acgtgccaag 3840 gaggagaagg tggaggaggg ccccagcgag gagattttcaccatggagcc cttgcctcat 3900 gtacaccggg agtctcgtgc ccgccgttcc agctatgctttctcccaccg ccagctgacg 3960 ttggagagcc agccagactc ctcggaggag aagtcagcatttttgaagcc ctccacaccg 4020 ttccggaaga gctggcaaaa ggagcctcac acccccaaggaggggacggt gccacttcca 4080 gacaagaccc acaaatctca 
ggtggagact ctgccaccaagtctggaaga atcgtccacg 4140 tccacgagcg agcagcctat ggaggtggag ctgtggcccgcggagaagca gtcatcatca 4200 tccatggagt ggctgctggt gcccggggag gagcagctatccttgccccc agaggagcag 4260 tcattgccct ctgcggaggg gaccagggtt cagcagtgacgtagcatctg aatccctaga 4320 cccatctgat gaagaggcat cttcgagccc aaaggagtcacgctggcata tcaggaagat 4380 gtccttcctg ggaagaagaa gctccagcca gttctgctgcaagtcaacca gcatgcaggg 4440 ggccttcctc taaagacaag gactccacat gcttttctttttctaataaa ccagggtcca 4500 tctgacccca gcgctaaaa 4519 34 2922 DNA Homosapiens misc_feature Incyte ID No 5547443CB1 34 gaggagtctg gcatggctcatgaatcagca gaggacttgt ttcatttcaa cgtagggggc 60 tggcatttct cagttcccagaagcaaactc tctcagtttc cagactccct gctgtggaaa 120 gaggcttcag ccttgacctcttcagaaagc cagaggctat ttatcgacag agatggttcc 180 acatttaggc acgtgcactattacctctac acctccaaac tctccttctc cagttgtgca 240 gaactgaact tgctgtatgagcaagcattg ggtttgcagc tgatgccttt gctgcagact 300 ctagataacc tgaaggaagggaaacaccat ctacgcgtac ggcctgcaga cctacctgtt 360 gctgagagag catctctgaactactggcgt acatggaagt gtattagcaa accctcagaa 420 tttccaatta aaagcccagcctttacaggc ctacatgata aggcacctct ggggctcatg 480 gacacacccc tgttagacacagaagaggag gtgcactact gcttcctgcc cctagacctg 540 gtggccaaat atcccagcctagtgactgaa gacaacctgc tgtggctggc tgagacggtg 600 gccctcatcg agtgcgagtgcagcgagttc cgcttcattg tgaattttct tcgctcacag 660 aagattttac taccggataatttctccaac attgatgtat tagaagcaga agtggaaatt 720 ctggaaatcc ctgcactcactgaagccgta aggtggtacc ggatgaacat gggtggctgt 780 tccccgacca cctgttctcccctgagcccc gggaaggggg cccgcacagc cagcctggag 840 tccgtgaaac cgctctacacaatggccctg ggtctgctgg tcaagtaccc ggactctgcg 900 ctgggccagc ttcgcatcgagagcacgcta gacggaagcc gactgtacat cacagggaat 960 ggcgtcctct ttcagcacgtcaagaactgg ctggggactt gccggctgcc cctgacagag 1020 accatttccg aggtatatgagctctgtgcc ttcctagaca aaagggacat cacctacgag 1080 ccaatcaaag ttgctttgaagactcatctg gagccaagga ctttggcacc catggatgtg 1140 ctcaatgagt ggacggcagagatcactgtg tattccccac aacagatcat caaagtgtat 1200 gttggaagcc actggtacgcaaccaccctg cagacactgc tgaagtatcc 
agaactgctg 1260 tccaaccctc agagagtgtactggatcaca tatggacaaa ccctgctcat ccacggggat 1320 ggccagatgt tccgacacattctcaacttc ctgagacttg gcaaactgtt tttaccatct 1380 gaatttaagg aatggcccctcttctgccag gaggtggagg aataccacat tccatccctc 1440 tcagaagccc ttgcacaatgtgaagcatac aagtcatgga ctcaggagaa agaatctgaa 1500 aatgaagaag ctttttccatcaggaggctg catgtggtga cagaagggcc agggtcactg 1560 gtggagttca gtagagacactaaagaaacc acagcctaca tgcctgtgga cttcgaagac 1620 tgcagtgaca ggactccatggaacaaggct aagggaaacc tggtcaggtc caaccagatg 1680 gatgaggctg agcagtacactcggcccatc caggtgtccc tatgccgaaa tgccaagagg 1740 gctggcaacc ctagcacatactcacactgc cgtggcttgt gtaccaatcc tggacactgg 1800 gggagccacc ctgagagccccccaaagaag aaatgcacca caatcaacct cacacagaaa 1860 tctgaaacca aagaccctcccgccactccc atgcaaaaac tcatctccct ggtgagagaa 1920 tgggacatgg tcaattgcaaacagtgggaa ttccagccac tgacagccac acggagcagc 1980 cccttggagg aggccaccctgcagctcccc ttgggaagcg aggctgcttc ccagcccagc 2040 acctcagctg cctggaaagcccattccaca gcctcagaga aggatccagg accacaggca 2100 ggggctggag ctggagcgaaagacaagggg ccagagccaa ccttcaagcc atacttaccc 2160 ccaaaaagag ctggcaccctgaaggactgg agcaagcaga ggaccaagga gagagaaagc 2220 cctgcccctg agcagcctctgcccgaggcc agtgaggtgg acagcctagg ggttatcctc 2280 aaagtgactc acccccccgtggtgggcagc gatggcttct gcatgttctt tgaggacagc 2340 atcatctata ccacggagatggacaacctc aggcacacaa cacccacagc cagtccccag 2400 ccccaagaag tgactttcctgagtttctct ctgtcctggg aagagatgtt ttatgcacag 2460 aaatgtcact gcttcctggctgacatcatc atggattcca tcaggcaaaa ggaccccaaa 2520 gccatcacag ccaaggtggtctccctggcc aatcggctgt ggaccctgca catcagcccc 2580 aagcagtttg tggtagatttgctggccatc accggcttca aggatgaccg gcacacccag 2640 gagcgcctgt acagctgggtggagcttaca ctgcccttcg ccaggaaata tggccgatgc 2700 atggacctgc tcatccagaggggcctgtct aggtctgtct cttactccat cctgggaaag 2760 tacctacaag aggactagggtgcccagaga tgcagcccct catgccccac ccgccaagtc 2820 tcattttaat tggagatagcccagaatgca tgtgcccatc agagggtaca tatcagtcta 2880 ttttttaata taaacaaataaaagattaaa tcacacatca aa 2922 35 2763 DNA Homo sapiens 
misc_featureIncyte ID No 56008413CB1 35 ggaccccagg ccgggccggg ccgagaggct gccatgggctccgtggggag ccagcgcctt 60 gaggagccca gcgtggcagg cacaccagac ccgggcgtagtgatgagctt caccttcgac 120 agtcaccagc tggaggaggc ggcggaggcg gctcagggccagggccttag ggccaggggc 180 gtcccagctt tcacggatac tacattggac gagccagtgcccgatgaccg ttatcacgcc 240 atctactttg cgatgctgct ggctggcgtg ggcttcctgctgccatacaa cagcttcatc 300 acggacgtgg actacctgca tcacaagtac ccagggacctccatcgtgtt tgacatgagc 360 ctcacctaca tcttggtggc actggcagct gtcctcctgaacaacgtcct ggtggagaga 420 ctgaccctgc acaccaggat caccgcaggc tacctcttagccttgggccc tctccttttt 480 atcagcatct gcgacgtgtg gctgcagctc ttctctcgggaccaggccta cgccatcaac 540 ctggccgctg tgggcaccgt ggccttcggc tgcacagtgcagcaatccag cttctacggg 600 tacacgggga tgctgcccaa gcggtacacg cagggggtgatgaccgggga gagcacggcg 660 ggcgtgatga tctctctgag ccgcatcctc acgaagctgctgctgcccga cgagcgcgcc 720 agcacgctca tcttcttcct ggtgtcggtg gcgctggagctgctgtgttt cctgctgcac 780 ctgttagtgc ggcgcagccg cttcgtgctc ttctataccacacggccgcg tgacagccac 840 cggggcaggc caggcctggg caggggctat ggctaccgcgtgcaccacga cgttgtcgcc 900 ggggacgtcc acttcgagca cccagccccg gccctggcccccaacgagtc cccaaaggac 960 agcccagccc acgaggtgac cggcagcggc ggggcctacatgcgctttga cgtgccgcgg 1020 ccaagggtcc agcgcagctg gcccaccttc agagccctgttactgcaccg ctacgtggtg 1080 gcgcgggtga tctgggccga catgctctcc atcgccgtgacctacttcat cacgctgtgc 1140 ctgttccccg gcctcgagtc tgagatccgc cactgcatcctgggcgagtg gctgcccatc 1200 ctcatcatgg ctgtgttcaa cctgtcagac ttcgtgggcaagatcctggc agccctgccc 1260 gtggactggc ggggcaccca cctgctggcc tgctcctgcctgcgtgtggt cttcatcccc 1320 ctcttcatcc tgtgcgtcta ccccagcggc atgcccgccctccgtcaccc cgcctggccc 1380 tgcatcttct cactgctcat gggcatcagc aacggctacttcggcagcgt gcccatgatc 1440 ctggcggcag gcaaagtgag ccccaagcag cgggagctggcagggaacac catgaccgtg 1500 tcctacatgt cagggctgac gctggggtcc gccgtggcctactgcaccta cagcctcacc 1560 cgcgacgctc acggcagctg cctgcacgcc tccaccgccaatggttccat cctcgcaggc 1620 ctctgagcca gccccgccca ctgccaggga cgccgagggcctgaccaggg gccccgaggc 1680 ctgagggccc 
ctcccctgtc cccacctcag tgcctgcggggccctgagcc tccccctgtg 1740 ccagcagccc cactccctca gggtccagcc atgccccaccctggactgaa gttctgcaaa 1800 gtcctccgag gaccggaaca cgtttctgcg acccggggctctggccagca ctgtgttctg 1860 cgtttggtct catacctgcg tctaccttcc atctgtgtccagcggccccg gctccagccc 1920 agccagcact ctgcagggtc acacgcaccg tgtccccacccaggacagca gacacccgcc 1980 agagtgtgcg cgcccagtga ctgcaccccg gccctcatcacccaccggca ctgatcgggg 2040 caccgcctgg cccagcctcc accagggacc cctcctcatgaactctggag ccctgagagg 2100 agaggggcag ccccccacct tgtcaccctc agggcttccccttctgtcct cattcttaga 2160 gactgcttct cccaaacata acgcgttagc catgaaggagtcggagccct gggtccgaat 2220 ggacccgcct gcggtctgca tcagcctctg ggaaaccacagcagtgatgc cagctgggca 2280 cgtcaggacc tccccacaca cccacacgat gccacaggtcagggggctgt gcctgactag 2340 ggagccctcc cattgccttc ctggcccggg atagaagaggggaggtaagt ctgggggcta 2400 cgaagccggg cccccacacc ctggctgaag tcagcttgacctaggtcttg accctcatcc 2460 agcaagggac tcgacagacc caagggtccc tggaacgtagggaggggctg ggggtcactc 2520 cagcccgggc ctcccagaac accaggcccg tgtgggtggcaccctgaggt caggggatcc 2580 taagggtgtc cttccagaga cggtgtttcc agggggaggaccgcccccgc ttccagatcc 2640 ccggccccgg ctgtgactgc cctgtttcac ccctgctgtgtcccatcccc cgtctgtcca 2700 ctaactgtac cgcaccggcc atttaaagat gaaggcagaccgctgccaaa aaaaaaaaaa 2760 aaa 2763 36 5211 DNA Homo sapiensmisc_feature Incyte ID No 6127911CB1 36 aagagctgct ggagtaggca cccatttaaagaaaaaatga agaagcagca ataaagaagt 60 tgtaatcgtt acctagacaa acagagaactggttttgaca gtgtttctag agtgcttttt 120 attattttcc tgacagttgt gttccaccatgattactttc tccttcagcg aataggctaa 180 atgaatatga aacagaaaag cgtgtatcagcaaaccaaag cacttctgtg caagaatttt 240 cttaagaaat ggaggatgaa aagagagagcttattggaat ggggcctctc aatacttcta 300 ggactgtgta ttgctctgtt ttccagttccatgagaaatg tccagtttcc tggaatggct 360 cctcagaatc tgggaagggt agataaatttaatagctctt ctttaatggt tgtgtataca 420 ccaatatcta atttaaccca gcagataatgaataaaacag cacttgctcc tcttttgaaa 480 ggaacaagtg tcattggggc accaaataaaacacacatgg acgaaatact tctggaaaat 540 ttaccatatg ctatgggaat catctttaatgaaactttct cttataagtt 
aatatttttc 600 cagggatata acagtccact ttggaaagaagatttctcag ctcattgctg ggatggatat 660 ggtgagtttt catgtacatt gaccaaatactggaatagag gatttgtggc tttacaaaca 720 gctattaata ctgccattat agaaatcacaaccaatcacc ctgtgatgga ggagttgatg 780 tcagttactg ctataactat gaagacattacctttcataa ctaaaaatct tcttcacaat 840 gagatgttta ttttattctt cttgcttcatttctccccac ttgtatattt tatatcactc 900 aatgtaacaa aagagagaaa aaagtctaagaatttgatga aaatgatggg tctccaagat 960 tcagcattct ggctctcctg gggtctaatctatgctggct tcatctttat tatttccata 1020 ttcattacaa ttatcataac attcacccaaattatagtca tgactggctt catggtcata 1080 tttatactct tttttttata tggcttatctttggtagctt tggtgttcct gatgagtgtg 1140 ctgttaaaga aagctgtcct caccaatttggttgtgtttc tccttaccct cttttgggga 1200 tgtctgggat tcactgtatt ttatgaacaacttccttcat ctctggagtg gattttgaat 1260 atttgtagcc cttttgcctt tactactggaatgattcaga ttatcaaact ggattataac 1320 ttgaatggtg taatttttcc tgacccttcaggagactcat acacaatgat agcaactttt 1380 tctatgttgc ttttggatgg tctcatctacttgctattgg cattatactt tgacaaaatt 1440 ttaccctatg gagatgagcg ccattattctcctttatttt tcttgaattc atcatcttgt 1500 ttccaacacc aaaggactaa tgctaaggttattgagaaag aaatcgatgc tgagcatccc 1560 tctgatgatt attttgaacc agtagctcctgaattccaag gaaaagaagc catcagaatc 1620 agaaatgtta agaaggaata taaaggaaaatctggaaaag tggaagcatt gaaaggcttg 1680 ctctttgaca tatatgaagg tcaaatcacggcaatcctgg gtcacagtgg agctggcaaa 1740 tcttcactgc taaatattct taatggattgtctgttccaa cagaaggatc agttaccatc 1800 tataataaaa atctctctga aatgcaagacttggaggaaa tcagaaagat aactggcgtc 1860 tgtcctcaat tcaatgttca atttgacatactcaccgtga aggaaaacct cagcctgttt 1920 gctaaaataa aagggattca tctaaaggaagtggaacaag aggtacaacg aatattattg 1980 gaattggaca tgcaaaacat tcaagataaccttgctaaac atttaagtga aggacagaaa 2040 agaaagctga cttttgggat taccattttaggagatcctc aaattttgct tttagatgaa 2100 ccaactactg gattggatcc cttttccagagatcaagtgt ggagcctcct gagagagcgt 2160 agagcagatc atgtgatcct tttcagtacccagtccatgg atgaggctga catcctggct 2220 gatagaaaag tgatcatgtc caatgggagactgaagtgtg caggttcttc tatgtttttg 2280 aaaagaaggt ggggtcttgg 
atatcacctaagtttacata ggaatgaaat atgtaaccca 2340 gaacaaataa catccttcat tactcatcacatccccgatg ctaaattaaa aacagaaaac 2400 aaagaaaagc ttgtatatac tttgccactggaaaggacaa atacatttcc agatcttttc 2460 agtgatctgg ataagtgttc tgaccagggagtgacaggtt atgacatttc catgtcaact 2520 ctaaatgaag tctttatgaa actggaaggacagtcaacta tcgaacaaga tttcgaacaa 2580 gtggagatga taagagactc agaaagcctcaatgaaatgg agctggctca ctcttccttc 2640 tctgaaatgc agacagctgt gagtgacatgggcctctgga gaatgcaagt ctttgccatg 2700 gcacggctcc gtttcttaaa gttaaaacgtcaaactaaag tgttattgac cctattattg 2760 gtatttggaa tcgcaatatt ccctttgattgttgaaaata taatatatgc tatgttaaat 2820 gaaaagatcg attgggaatt taaaaacgaattgtattttc tctctcctgg acaacttccc 2880 caggaacccc gtaccagcct gttgatcatcaataacacag aatcaaatat tgaagatttt 2940 ataaaatcac tgaagcatca aaatatacttttggaagtag atgactttga aaacagaaat 3000 ggtactgatg gcctctcata caatggagctatcatagttt ctggtaaaca aaaggattat 3060 agattttcag ttgtgtgtaa taccaagagattgcactgtt ttccaattct tatgaatatt 3120 atcagcaatg ggctacttca aatgtttaatcacacacaac atattcgaat tgagtcaagc 3180 ccatttcctc ttagccacat aggactctggactgggttgc cggatggttc ctttttctta 3240 tttttggttc tatgtagcat ttctccttatatcaccatgg gcagcatcag tgattacaag 3300 aaaaatgcta agtcccagct atggatttcaggcctctaca cttctgctta ctggtgtggg 3360 caggcactag tggacgtcag cttcttcattttaattctcc ttttaatgta tttaattttc 3420 tacatagaaa acatgcagta ccttcttattacaagccaaa ttgtgtttgc tttggttata 3480 gttactcctg gttatgcagc ttctcttgtcttcttcatat atatgatatc atttattttt 3540 cgcaaaagga gaaaaaacag tggcctttggtcattttact tcttttttgc ctccaccatc 3600 atgttttcca tcactttaat caatcattttgacctaagta tattgattac caccatggta 3660 ttggttcctt catatacctt gcttggatttaaaacttttt tggaagtgag agaccaggag 3720 cactacagag aatttccaga ggcaaattttgaattgagtg ccactgattt tctagtctgc 3780 ttcataccct actttcagac tttgctattcgtttttgttc taagatgcat ggaactaaaa 3840 tgtggaaaga aaagaatgcg aaaagatcctgttttcagaa tttcccccca aagtagagat 3900 gctaagccaa atccagaaga acccatagatgaagatgaag atattcaaac agaaagaata 3960 agaacagcca ctgctctgac cacttcaatcttagatgaga aacctgttat 
aattgccagc 4020 tgtctacaca aagaatatgc aggccagaagaaaagttgct tttcaaagag gaagaagaaa 4080 atagcagcaa gaaatatctc tttctgtgttcaagaaggtg aaattttggg attgctagga 4140 cccagtggtg ctggaaaaag ttcatctattagaatgatat ctgggatcac aaagccaact 4200 gctggagagg tggaactgaa aggctgcagttcagttttgg gccacctggg gtactgccct 4260 caagagaacg tgctgtggcc catgctgacgttgagggaac acctggaggt gtatgctgcc 4320 gtcaaggggc tcaggaaagc ggacgcgaggctcgccatcg caagattagt gagtgctttc 4380 aaactgcatg agcagctgaa tgttcctgtgcagaaattaa cagcaggaat cacgagaaag 4440 ttgtgttttg tgctgagcct cctgggaaactcacctgtct tgctcctgga tgaaccatct 4500 acgggcatag accccacagg gcagcagcaaatgtggcagg caatccaggc agtcgttaaa 4560 aacacagaga gaggtgtcct cctgaccacccataacctgg ctgaggcgga agccttgtgt 4620 gaccgtgtgg ccatcatggt gtctggaaggcttagatgca ttggctccat ccaacacctg 4680 aaaaacaaac ttggcaagga ttacattctagagctaaaag tgaaggaaac gtctcaagtg 4740 actttggtcc acactgagat tctgaagcttttcccacagg ctgcagggca ggaaaggtat 4800 tcctctttgt taacctataa gctgcccgtggcagacgttt accctctatc acagaccttt 4860 cacaaattag aagcagtgaa gcataactttaacctggaag aatacagcct ttctcagtgc 4920 acactggaga aggtattctt agagctttctaaagaacagg aagtaggaaa ttttgatgaa 4980 gaaattgata caacaatgag atggaaactcctccctcatt cagatgaacc ttaaaacctc 5040 aaacctagta attttttgtt gatctcctataaacttatgt tttatgtaat aattaatagt 5100 atgtttaatt ttaaagatca tttaaaattaacatcaggta tattttgtaa atttagttaa 5160 caaatacata aattttaaaa ttattcttcctctcaacata ggggtgatag c 5211 37 5701 DNA Homo sapiens misc_featureIncyte ID No 6427133CB1 37 gctcccaagg ctgagattac tctgcttcat ctggatcgcccatctctggg gtctcatggc 60 tgagtttcag ttccccaatc ctacctgctc ctcagggggccagcactggg gctgcaggta 120 ggccacctgt tgagacctgg tgaaagatca ggtataataatgttctgcag tgaaaagaaa 180 ttgcgtgaag tggaacggat agtgaaagcc aatgaccgtgaatataatga aaagttccag 240 tatgcggata atcgtatcca cacatcgaaa tataatattctcaccttctt gccaattaat 300 ttatttgaac agttccaaag agtggcaaat gcctattttctttgccttct gattttacag 360 ctaattccag aaatttcctc cttgacctgg tttaccaccattgtgccttt ggtcctggtg 420 ataactatga cagctgtcaa agatgccaca 
gatgactattttcgccacaa gagtgataat 480 caagtgaata atcggcagtc tgaagtgctc atcaacagcaaactgcagaa tgaaaaatgg 540 atgaatgtca aagtgggaga catcattaaa ttagaaaataaccaatttgt tgctgctgat 600 ttacttctcc tatcaagtag tgagccacat ggtctctgttatgttgaaac tgctgagctt 660 gatggggaaa cgaacctaaa agtccgccat gcactatcagttacttcaga acttggagca 720 gatatcagca gacttgcagg gtttgatggg attgttgtctgtgaggtgcc taacaacaag 780 ttagataaat tcatgggaat cctttcttgg aaagacagcaagcattccct caacaatgag 840 aagataatcc cgagaggctg catcctgaga aataccagctggtgttttgg aatggttatt 900 tttgcaggtc ctgacactaa actaatgcag aatagtggtaagacaaagtt taaaaggaca 960 agcattgata gattgatgaa tactctagta ctatggatttttgggtttct gatatgcttg 1020 ggaattattc ttgcaatagg aaattcaatc tgggagagtcaaactgggga ccaattcaga 1080 actttcctct tttggaatga aggagagaag agctctgtgttctccggatt cttaacattc 1140 tggtcatata ttattattct caatacagtt gtacccatttccttatatgt gagtgtggaa 1200 gtaattcgtc taggacacag ttattttata aactgggaccggaagatgta ttattctcga 1260 aaagcaatac ctgcagtggc tcgaacgacc acgctcaatgaggaactggg gcagattgag 1320 tacattttct ccgacaaaac gggtaccctc actcaaaacatcatgacctt taaaagatgt 1380 tccattaatg ggagaatcta tggtgaagta catgatgacctggatcagaa gacagaaata 1440 actcaggaaa aagagcctgt ggatttctca gtcaaatctcaagcggatag agaatttcag 1500 ttctttgacc acaatctgat ggaatccatt aaaatgggtgatcccaaagt tcatgaattc 1560 cttaggttac ttgctctctg ccacactgta atgtcagaagagaatagcgc aggagagctg 1620 atttaccaag ttcagtcacc tgatgaaggg gctctagtgactgccgctag aaattttggg 1680 ttcattttta aatcccggac cccagagacc ataacaatagaagaattggg aacactagtt 1740 acttatcaat tacttgcctt tttggatttc aacaacaccagaaaaaggat gtctgtcata 1800 gttcgaaacc cagaaggaca gataaagctt tattccaaaggagcagatac tattctgttt 1860 gaaaaacttc atccttccaa tgaagtcctt ttgtctttgacgtcagacca cctcagtgaa 1920 tttgcagggg aaggccttcg gaccttggcc atcgcatacagagacctgga tgacaagtac 1980 tttaaagagt ggcataagat gcttgaagat gcgaatgctgccacagaaga gagggatgaa 2040 cgaatagctg ggctatatga agaaattgaa agagatttgatgctactagg tgccactgct 2100 gtagaagata agttacagga gggtgttatt gaaacagttacaagtttatc actagccaat 2160 attaagatct 
gggtcctaac aggagacaaa caagaaactgccatcaacat cggttatgcc 2220 tgcaacatgc tgactgacga catgaatgat gtgtttgtgatagcagggaa taatgctgtg 2280 gaagtgagag aagaactcag gaaagcaaaa caaaatttgtttggacaaaa cagaaatttt 2340 tccaatggcc atgtagtttg tgaaaaaaag cagcagctggagttggattc tattgtagaa 2400 gaaaccataa caggagatta tgccttaatc ataaatggccacagtttggc tcatgcccta 2460 gaaagtgatg tcaagaatga tctcctagaa cttgcttgcatgtgtaagac tgtaatttgc 2520 tgcagggtca ctccactcca gaaagcccaa gtggtagagctggtgaagaa gtacagaaat 2580 gctgttactt tggccattgg tgatggagcc aatgatgtcagcatgattaa aagtgctcac 2640 attggtgttg gcatcagcgg ccaggaagga ttgcaagcagtcttagccag cgactattca 2700 tttgcacagt ttagatatct ccaaaggctt ctccttgttcatggaaggtg gtcttatttc 2760 cgaatgtgca aattcttatg ctatttcttc tataagaattttgcatttac acttgtgcat 2820 ttctggtttg gtttcttctg tggtttctca gcccagactgtttatgacca gtggttcatc 2880 acccttttta acattgttta cacatcactg cctgttttagccatggggat ttttgaccag 2940 gatgtgagtg accagaacag cgtggactgt ccccagctctacaaaccagg acagctgaat 3000 ctgcttttta acaagcgtaa atttttcatt tgcgtgttgcatggaatcta cacctcatta 3060 gtccttttct tcatccccta tggggccttt tacaacgtggctggagaaga tgggcaacat 3120 attgctgact accagtcctt tgcagttacc atggccacatctttggtcat tgtggtcagt 3180 gtgcagatag ccttggatac cagttactgg actttcattaatcacgtctt catctggggg 3240 agcattgcca tttatttctc cattttattt acaatgcacagtaatggcat ctttggcatc 3300 ttcccaaacc agtttccatt tgttggtaat gcacgacattccctgaccca gaagtgcatc 3360 tggcttgtaa ttctcttaac aacagtggct tcagttatgccagtggtggc attcagattt 3420 ttgaaggtgg atttataccc aaccctgagt gatcagatccgccggtggca gaaggctcaa 3480 aagaaggcaa ggcctccaag tagccgaagg cctcggacccgcaggtcaag ctcaagaagg 3540 tctggatatg cttttgctca ccaagaaggc tatggagagcttatcacatc tggaaaaaat 3600 atgcgagcta aaaatccacc cccaacatca gggctggaaaagacacatta taatagcact 3660 agctggattg aaaatttatg taagaaaacc acagacaccgtgagcagctt tagccaggat 3720 aaaacagtga aactgtgagt caatatgaat ttaaaccacgtagttatctt ttcacttcag 3780 gtggagctga aattctgctg gctccagagt ttgagatttgaggcaagagg tggggcaggc 3840 agattgcctc acttaactta aatctgcggc 
agacaactgccagtgcccat caaacaggag 3900 tgtgcgctat ggaaaaccag gccagagggt cactgtctggtttgtgattt ggtggacaaa 3960 acactcgctg ttacaagtac agattttttt tttttttaaatcaacctaga taccaattga 4020 cctgaacttt agaatcttat ttatggagaa aaacttgtaaagctgcatat tcactgaatg 4080 gatcctcagg cggataaaag ggtgcatttt aaaggtatatatccaagctg aaaagcatgc 4140 ctattgacag ataaacatgt atctgtaaga tcagcctttcccaaggtata cttttaaaat 4200 ttaaagcgtg tactgtgttg ctttcagact gagttgcatgtcactcttta gtcttgatat 4260 ctacctgtct gttcagccag gacaacaaat ggcttccaagcctgaagaat acaaaagtgt 4320 gcttgtgttt ctcattttta taccagtcta gggacaaaggagactgaaca tctttgcagc 4380 aggataggct ggtaatttga tcaaatttat tcaaaaagctctcagtctgt gtcatgtaag 4440 gacatgctta tgaaatgtga gagaggctcg ccactaagtattctaaatac ttttcaatgg 4500 cttttctaac aacctcagta gtaatttgct gagcatcatccagaccatta atagaatcag 4560 caaagcactg gaatttcaca ctttaatgat aatattccacatagtctatg ggcaaatatt 4620 ttcaacattt ccaattttta aagcttcaga attgaagccaaacaaattaa taaataattg 4680 ttttaattac tatttaaaaa ctcaggttta gattgtttaaaattagttgc ttttgatact 4740 cagctgtcat gtttataatt caaacatgta gtaaacatatgtaggtaagg ttgttttttt 4800 ggagatgttg cagctcaaat ttcagtccac atatgaatcatcagtgtatt ttccataaag 4860 tgattcgggc atatttgtgt gaaaacctca gttctgtcacttcttacctc tataaacttg 4920 gacgataatg tgccttctct gagactcagt ttcttcctctgtaaaatgag gacatactac 4980 ctacctcatg tggttggttg atgattgtct gtcaaagcacaaactctgaa attattaaaa 5040 acataattat ttcataaaca gatgagttaa gttccagttaactcaacatc agtataacag 5100 agcaattgga agagaatatg aaaaaactgg aatctaaatagtcagtgagg aaggctttga 5160 taaaatgaaa ttgccagaaa gatataaaac tggttagggtcctacaggga aataaaatta 5220 taaccgtgga ggtacatttc tctaccagaa agcaaaaataaagcatcatg tcttaatggt 5280 tttctacaaa tcaacttcta attctacaga gtccttaatctggtccctat taaattcttg 5340 gtcagacaaa gttacatttc ccaagagagt caggtgacacttgagtgagt ttgatggata 5400 atgagctaat gtgatatcta taggtcacaa ttttttaaaaccaaaatttt caagtctggg 5460 ataatctttc ctaaatggga tcaaatgaaa taatatgtgtaaaagagtca aatgcagtcc 5520 tttaccatag taactgccta tggacgttgt ctttcccttacatgcctgcc tacacttaac 5580 
cagatgttgg ttttcaatgt ctaatttgtc attagtttcaccacatttgc tcactttttg 5640 taacattttt gcaagatttg aaaactttca gtaaatgttttggcactatt ggtaaaaaaa 5700 a 5701 38 1990 DNA Homo sapiens misc_featureIncyte ID No 7472932CB1 38 atggctcatg ccccagaacc agacccggcc gccagcgacctcggggatga gaggcccaag 60 tgggacaaca aggcccagta cctcctgagc tgcatcgggtttgccgtggg gctggggaac 120 atttggcggt tcccatacct gtgccagacc tatggaggaggtgccttcct catcccctac 180 gtcatcgcgc tggtcttcga ggggatcccc attttccacgtcgagctcgc catcggccag 240 cggctgcgga agggcagcgt cggcgtgtgg acggccatctccccgtacct cagtggagta 300 ggtctgggct gtgtcacgct gtccttcctg atcagcctgtactacaacac catcgtggcg 360 tgggtgctgt ggtacctcct caactccttc cagcacccgctgccctggag ctcctgccca 420 ccggacctca acagaacagg ttttgtggag gagtgccagggcagcagcgc cgtgagctac 480 ttctggtacc ggcagacact gaacatcaca gccgacatcaatgacagtgg ctccatccag 540 tggtggctgc tcatctgctt ggcagcctcc tgggcagtcgtgtacatgtg tgtcatcagg 600 ggcattgaga ctacagggaa ggtgatttac ttcacagctttgttccctta cctggtcctg 660 accatctttc tcatcagagg gctgaccctg ccaggggcaacaaaaggact catctacttg 720 ttcactccca acatgcacat tctccagaac ccccgggtgtggctggacgc agccacccag 780 atattcttct ctctgtccct ggccttcgga ggacacatcgcttttgcaag ttacaactcg 840 cccaggaatg actgccagaa ggatgcggtg gtcatcgccctggtcaacag gatgacctcc 900 ctgtacgcgt ccatcgctgt cttctctgtc ctggggttcaaagcaactaa tgactgtccc 960 cgcagaaaca tcctcagcct catcaacgac tttgacttcccagagcagag catctccagg 1020 gacgactacc cagccgtcct catgcacctg aacgccacctggcccaagag ggtggcccag 1080 ctccccctga aggcctgcct cctggaagac tttctggataagagtgcctc gggcccgggc 1140 ctggccttcg tcgtcttcac ggagaccgac ctccacatgccgggggctcc tgtgtgggcc 1200 atgctcttct tcgggatgct gttcaccttg gggctatcgaccatgttcgg gaccgtggag 1260 gcggtcatca cacccctgct ggacgtgggg gtcctgcctagatgggtccc caaggaggcc 1320 ctgactgggc tggtctgcct ggtctgcttc ctctccgccacctgcttcac gctgcagtct 1380 gggaactact ggctggagat tttcgacaat tttgccgcttccctgaacct gctcatgttg 1440 gcctttctcg aggttgtggg tgtcgtttat gtttatggaatgaaacggtt ctgcgatgac 1500 attgcgtgga tgaccgggag gcggcccagc ccctactggcggctgacctg 
gagggtggtc 1560 agtcccctgc tgctgaccat ctttgtggct tacatcatcctcctgttctg gaagccactg 1620 agatacaagg cctggaaccc caaatacgag ctgttcccctcgcgtcagga gaagctctac 1680 ccgggctggg cgcgcgccgc ctgtgtgctg ctgtccttgctgcccgtgct gtgggtcccg 1740 gtggccgcgc ttgctcagct gctcacccgg cggaggcggacgtggaggga cagggacgcg 1800 cgcccagaca cggacatgcg cccggacacg gacacgcgcccagacacgga catgcgcccg 1860 gacacggaca tgcgctgaag ccggccggag cggggcctgcatgggcgggt ctgtgggggg 1920 gcttggcctg atggtgggcg gggccccgcc cacagggccgaccccaatac accagcgact 1980 caaccttgaa 1990 39 3760 DNA Homo sapiensmisc_feature Incyte ID No 8463147CB1 39 atgacacagg catatcagaa atatattctagaaaagttac ctaaaagccc tggagacaaa 60 ggcagagcat ggcctgggtc aactccatctgggaatttgc tgtccccatt catggcagct 120 tctaactcct ttcctgagct gtgtagccaggtttccagaa gagagtactg ggacctgcat 180 ggaataccgt ctgaccactt ttctgtgagggtacaagttg aattctatat gaatgaaaat 240 acatttaaag aaagactaac attatttttcataacaaacc agagatcaag tctaaggata 300 cgcctgttca atttttctct caaattactaagctgcttat tatacataat ccgagtacta 360 ctagaaaacc cttcacaagg aaatgaatggtctcatatct tttgggtgaa cagaagtcta 420 cctttgtggg gcttacaggt ttcagtggcattgataagtc tgtttgaaac aatattactt 480 ggttatctta gttataaggg aaacatctgggaacagattt tacgaatacc cttcatcttg 540 gaaataatta atgcagttcc cttcattatctcaatattct ggccttcctt aaggaatcta 600 tttgtcccag tctttctgaa ctgttggcttgccaaacatg ccttggaaaa tatgattaat 660 gatctacaca gagccattca gcgtacacagtgctgcaaat gtgttaatca agttttgatt 720 gtaatatcta cattactatg ccttatcttcacctgcattt gtgggatcca acatctggaa 780 cgaataggaa agaagctgaa tctctttgactccctttatt tctgcattgt gacgttttct 840 actgtgggct tcggggatgt cactcctgaaacatggtcct ccaagctttt tgtagttgct 900 atgatttgtg ttgctcttgt ggttctacccatacagtttg aacagctggc ttatttgtgg 960 atggagagac aaaagtcagg aggaaactatagtcgacata gagctcaaac tgaaaagcat 1020 gtcgtcctgt gtgtcagctc actgaagattgatttactta tggatttttt aaatgaattc 1080 tatgctcatc ctaggctcca ggattattatgtggtgattt tgtgtcctac tgaaatggat 1140 gtacaggttc gaagggtact gcagattccaatgtggtccc aacgagttat ctaccttcaa 1200 ggttcagccc ttaaagatca 
agacctattgagagcaaaga tggatgacgc tgaggcctgt 1260 tttattctca gtagccgttg tgaagtggataggacatcat ctgatcacca aacaattttg 1320 agagcatggg ctgtgaaaga ttttgctccaaattgtcctt tgtatgtcca gatattaaag 1380 cctgaaaata aatttcacat caaatttgctgatcatgttg tttgtgaaga agagtttaaa 1440 tacgccatgt tagctttaaa ctgtatatgcccagcaacat ctacacttat tacactactg 1500 gttcatacct ctagagggca gtgtgtgtgcctgtgttgca gagaaggcca gcaatcgcca 1560 gaacaatggc agaagatgta cggtagatgctccgggaatg aagtctacca cattgttttg 1620 gaagaaagta cattttttgc tgaatatgaaggaaagagtt ttacatatgc ctctttccat 1680 gcacacaaaa agtttggcgt ctgcttgattggtgttagga gggaggataa taaaaacatt 1740 ttgctgaatc caggtcctcg atacattatgaattctacag acatatgctt ttatattaat 1800 attaccaaag aagagaattc agcatttaaaaaccaagacc agcagagaaa aagcaatgtg 1860 tccaggtcgt tttatcatgg accttccagattacctgtac atagcataat tgccagcatg 1920 ggtactgtgg ctatagactt gcaagatacaagctgtagat cagcaagtgg ccctaccctg 1980 tctcttccta cagagggaag caaagaaataagaagaccta gcattgctcc tgttttagag 2040 gttgcagata catcatcgat tcaaacatgtgatcttctaa gtgaccaatc agaagatgaa 2100 actacaccag atgaagaaat gtcttcaaacttagagtatg ctaaaggtta cccaccttat 2160 tctccatata taggaagttc acccactttttgtcatctcc ttcatgaaaa agtaccattt 2220 tgctgcttaa gattagacaa gagttgccaacataactact atgaggatgc aaaagcctat 2280 ggattcaaaa ataaactaat tatagttgcagctgaaacag ctggaaatgg attatataac 2340 tttattgttc ctctcagggc atattatagaccaaagaaag aacttaatcc catagtactg 2400 ctattggata acccgccaga tatgcattttctggatgcaa tctgttggtt tccaatggtt 2460 tactacatgg tgggctctat tgacaacctagatgacttac tcaggtgtgg agtgactttt 2520 gctgctaata tggtggttgt ggataaagagagcaccatga gtgccgagga agactacatg 2580 gcagatgcca aaaccattgt gaacgtgcagacactcttca ggttgttttc cagtctcagt 2640 attatcacag agctaactca ccccgccaacatgagattca tgcaattcag agccaaagac 2700 tgttactctc ttgctctttc aaaactggaaaagaaagaac gggagagagg ctctaacttg 2760 gcctttatgt ttcgactgcc ttttgctgctgggagggtgt ttagcatcag tatgttggac 2820 actctgctgt atcagtcatt tgtgaaggattatatgattt ctatcacgag acttctgttg 2880 ggactggaca ctacaccagg atctgggtttctttgttcta tgaaaatcac 
tgcagatgac 2940 ttatggatca gaacttatgc cagactttatcagaagttgt gttcttctac tggagatgtt 3000 cccattggaa tctacaggac tgagtctcagaaacttacta catctgagtc tcaaatatct 3060 atcagtgtag aagagtggga agacaccaaagactccaaag aacaagggca ccaccgcagc 3120 aaccaccgca actcaacatc cagtgaccagtcggaccatc ccttgctgcg gagaaaaagc 3180 atgcagtggg cccgaagact gagcagaaaaggcccaaaac actctggtaa aacagctgaa 3240 aaaataaccc agcagcgact gaacctctacaggaggtcag aaagacaaga gcttgctgaa 3300 cttgtgaaaa atagaatgaa acacttgggtctttctacag tgggatatga tgaaatgaat 3360 gatcatcaaa gtaccctctc ctacatcctgattaacccat ctccagatac cagaatagag 3420 ctgaatgatg ttgtatactt aattcgaccagatccactgg cctaccttcc aaacagtgag 3480 cccagtcgaa gaaacagcat ctgcaatgtcactggtcaag attctcggga ggaaactcaa 3540 ctttgataaa aataaaatga gaaacttttttcctacaaag accttgcttg aaaccacaaa 3600 agttttgctg gcacgaaaga aactagatggaaatatatgt aattctctca tatttaaaaa 3660 cgtaatctct tctcttagaa gtatagatcattttgaaact taatgtacta cttactggta 3720 ctctccctat taatatttga aggacctcaatggaaagcgg 3760 40 1150 DNA Homo sapiens misc_feature Incyte ID No7506408CB1 40 ccagaggaaa ctagtcacaa aaaccctgac tatcacctga tagattgcttgtgctgcctg 60 ataattactc gcacttttcc caggctagtg caaatcttca ggggccgtccaggactacag 120 agctgtttca ccctaccttg gcttcaatct cttcccccat gctcgaaggtgcggagctgt 180 acttcaacgt ggaccatggc tacctggagg gcctggttcg aggatgcaaggccagcctcc 240 tgacccagca agactatatc aacctggtcc agtgtgagac cctagaagctccattcttcc 300 aagactgcat gtctgaaaat gctctagatg aactgaatat tgaattgctacgcaataaac 360 tatacaagtc ttaccttgag gcattctata aattctgtaa gaatcatggtgatgtcacag 420 cagaagttat gtgtcccatt cttgagtttg aggccgacag acgtgcttttatcatcactc 480 ttaactcctt tggcactgaa ttgagcaaag aagaccgaga gaccctctatccaaccttcg 540 gcaaactcta tcctgagggg ttgcggctgt tggctcaagc agaagactttgaccagatga 600 agaacgtagc ggatcattac ggagtataca aacctttatt tgaagctgtaggtggcagtg 660 ggggaaagac attggaggac gtgttttacg agcgtgaggt acaaatgaatgtgctggcat 720 tcaacagaca gttccactac ggtgtgtttt atgcatatgt aaagctgaaggaacaggaaa 780 ttagaaatat tgtgtggata gcagaatgta tttcacagag gcatcgaactaaaatcaaca 
840 gttacattcc aattttataa cccaagtaag gttctcaaat gtagaaaattataaatgtta 900 aaaggaagtt attgaagaaa ataaaagaaa ttatgttata ttatctagactacacaaaag 960 taagccacac tatatcttca tgagttgcaa atccatggaa acacagtaaaccagccctga 1020 aacaaagcat ttccttgttt tcagtggtat tagatcttgt ttccacatgtctgtctcatt 1080 cttcactggg ccttacaggt tagttttaat taactctatg gtatttttctattcttgtct 1140 gatcatgtta 1150

What is claimed is:
 1. An isolated polypeptide selected from the group consisting of: a) a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, c) a biologically active fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-20.
 2. An isolated polypeptide of claim 1 comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20.
 3. An isolated polynucleotide encoding a polypeptide of claim 1.
 4. An isolated polynucleotide encoding a polypeptide of claim 2.
 5. An isolated polynucleotide of claim 4 comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40.
 6. A recombinant polynucleotide comprising a promoter sequence operably linked to a polynucleotide of claim 3.
 7. A cell transformed with a recombinant polynucleotide of claim 6.
 8. A transgenic organism comprising a recombinant polynucleotide of claim 6.
 9. A method of producing a polypeptide of claim 1, the method comprising: a) culturing a cell under conditions suitable for expression of the polypeptide, wherein said cell is transformed with a recombinant polynucleotide, and said recombinant polynucleotide comprises a promoter sequence operably linked to a polynucleotide encoding the polypeptide of claim 1, and b) recovering the polypeptide so expressed.
 10. A method of claim 9, wherein the polypeptide comprises an amino acid sequence selected from the group consisting of SEQ ID NO:1-20.
 11. An isolated antibody which specifically binds to a polypeptide of claim 1.
 12. An isolated polynucleotide selected from the group consisting of: a) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID NO:21-40, c) a polynucleotide complementary to a polynucleotide of a), d) a polynucleotide complementary to a polynucleotide of b), and e) an RNA equivalent of a)-d).
 13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a polynucleotide of claim 12.
 14. A method of detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 12, the method comprising: a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides comprising a sequence complementary to said target polynucleotide in the sample, and which probe specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization complex is formed between said probe and said target polynucleotide or fragments thereof, and b) detecting the presence or absence of said hybridization complex, and, optionally, if present, the amount thereof.
 15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides.
 16. A method of detecting a target polynucleotide in a sample, said target polynucleotide having a sequence of a polynucleotide of claim 12, the method comprising: a) amplifying said target polynucleotide or fragment thereof using polymerase chain reaction amplification, and b) detecting the presence or absence of said amplified target polynucleotide or fragment thereof, and, optionally, if present, the amount thereof.
 17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable excipient.
 18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence selected from the group consisting of SEQ ID NO:1-20.
 19. A method for treating a disease or condition associated with decreased expression of functional TRICH, comprising administering to a patient in need of such treatment the composition of claim 17.
 20. A method of screening a compound for effectiveness as an agonist of a polypeptide of claim 1, the method comprising: a) exposing a sample comprising a polypeptide of claim 1 to a compound, and b) detecting agonist activity in the sample.
 21. A composition comprising an agonist compound identified by a method of claim 20 and a pharmaceutically acceptable excipient.
 22. A method for treating a disease or condition associated with decreased expression of functional TRICH, comprising administering to a patient in need of such treatment a composition of claim 21.
 23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of claim 1, the method comprising: a) exposing a sample comprising a polypeptide of claim 1 to a compound, and b) detecting antagonist activity in the sample.
 24. A composition comprising an antagonist compound identified by a method of claim 23 and a pharmaceutically acceptable excipient.
 25. A method for treating a disease or condition associated with overexpression of functional TRICH, comprising administering to a patient in need of such treatment a composition of claim 24.
 26. A method of screening for a compound that specifically binds to the polypeptide of claim 1, the method comprising: a) combining the polypeptide of claim 1 with at least one test compound under suitable conditions, and b) detecting binding of the polypeptide of claim 1 to the test compound, thereby identifying a compound that specifically binds to the polypeptide of claim 1.
 27. A method of screening for a compound that modulates the activity of the polypeptide of claim 1, the method comprising: a) combining the polypeptide of claim 1 with at least one test compound under conditions permissive for the activity of the polypeptide of claim 1, b) assessing the activity of the polypeptide of claim 1 in the presence of the test compound, and c) comparing the activity of the polypeptide of claim 1 in the presence of the test compound with the activity of the polypeptide of claim 1 in the absence of the test compound, wherein a change in the activity of the polypeptide of claim 1 in the presence of the test compound is indicative of a compound that modulates the activity of the polypeptide of claim 1.
 28. A method of screening a compound for effectiveness in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method comprising: a) exposing a sample comprising the target polynucleotide to a compound, under conditions suitable for the expression of the target polynucleotide, b) detecting altered expression of the target polynucleotide, and c) comparing the expression of the target polynucleotide in the presence of varying amounts of the compound and in the absence of the compound.
 29. A method of assessing toxicity of a test compound, the method comprising: a) treating a biological sample containing nucleic acids with the test compound, b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions whereby a specific hybridization complex is formed between said probe and a target polynucleotide in the biological sample, said target polynucleotide comprising a polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof, c) quantifying the amount of hybridization complex, and d) comparing the amount of hybridization complex in the treated biological sample with the amount of hybridization complex in an untreated biological sample, wherein a difference in the amount of hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
 30. A diagnostic test for a condition or disease associated with the expression of TRICH in a biological sample, the method comprising: a) combining the biological sample with an antibody of claim 11, under conditions suitable for the antibody to bind the polypeptide and form an antibody:polypeptide complex, and b) detecting the complex, wherein the presence of the complex correlates with the presence of the polypeptide in the biological sample.
 31. The antibody of claim 11, wherein the antibody is: a) a chimeric antibody, b) a single chain antibody, c) a Fab fragment, d) a F(ab′)₂ fragment, or e) a humanized antibody.
 32. A composition comprising an antibody of claim 11 and an acceptable excipient.
 33. A method of diagnosing a condition or disease associated with the expression of TRICH in a subject, comprising administering to said subject an effective amount of the composition of claim 32.
 34. A composition of claim 32, wherein the antibody is labeled.
 35. A method of diagnosing a condition or disease associated with the expression of TRICH in a subject, comprising administering to said subject an effective amount of the composition of claim 34.
 36. A method of preparing a polyclonal antibody with the specificity of the antibody of claim 11, the method comprising: a) immunizing an animal with a polypeptide consisting of an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, or an immunogenic fragment thereof, under conditions to elicit an antibody response, b) isolating antibodies from said animal, and c) screening the isolated antibodies with the polypeptide, thereby identifying a polyclonal antibody which binds specifically to a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20.
 37. A polyclonal antibody produced by a method of claim 36.
 38. A composition comprising the polyclonal antibody of claim 37 and a suitable carrier.
 39. A method of making a monoclonal antibody with the specificity of the antibody of claim 11, the method comprising: a) immunizing an animal with a polypeptide consisting of an amino acid sequence selected from the group consisting of SEQ ID NO:1-20, or an immunogenic fragment thereof, under conditions to elicit an antibody response, b) isolating antibody producing cells from the animal, c) fusing the antibody producing cells with immortalized cells to form monoclonal antibody-producing hybridoma cells, d) culturing the hybridoma cells, and e) isolating from the culture monoclonal antibody which binds specifically to a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20.
 40. A monoclonal antibody produced by a method of claim 39.
 41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier.
 42. The antibody of claim 11, wherein the antibody is produced by screening a Fab expression library.
 43. The antibody of claim 11, wherein the antibody is produced by screening a recombinant immunoglobulin library.
 44. A method of detecting a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20 in a sample, the method comprising: a) incubating the antibody of claim 11 with a sample under conditions to allow specific binding of the antibody and the polypeptide, and b) detecting specific binding, wherein specific binding indicates the presence of a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20 in the sample.
 45. A method of purifying a polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20 from a sample, the method comprising: a) incubating the antibody of claim 11 with a sample under conditions to allow specific binding of the antibody and the polypeptide, and b) separating the antibody from the sample and obtaining the purified polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NO:1-20.
 46. A microarray wherein at least one element of the microarray is a polynucleotide of claim 13.
 47. A method of generating an expression profile of a sample which contains polynucleotides, the method comprising: a) labeling the polynucleotides of the sample, b) contacting the elements of the microarray of claim 46 with the labeled polynucleotides of the sample under conditions suitable for the formation of a hybridization complex, and c) quantifying the expression of the polynucleotides in the sample.
 48. An array comprising different nucleotide molecules affixed in distinct physical locations on a solid substrate, wherein at least one of said nucleotide molecules comprises a first oligonucleotide or polynucleotide sequence specifically hybridizable with at least 30 contiguous nucleotides of a target polynucleotide, and wherein said target polynucleotide is a polynucleotide of claim 12.
 49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to at least 30 contiguous nucleotides of said target polynucleotide.
 50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to at least 60 contiguous nucleotides of said target polynucleotide.
 51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is completely complementary to said target polynucleotide.
 52. An array of claim 48, which is a microarray.
 53. An array of claim 48, further comprising said target polynucleotide hybridized to a nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.
 54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to said solid substrate.
 55. An array of claim 48, wherein each distinct physical location on the substrate contains multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical location have the same sequence, and each distinct physical location on the substrate contains nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at another distinct physical location on the substrate.
 56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:1.
 57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2.
 58. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:3.
 59. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:4.
 60. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:5.
 61. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:6.
 62. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:7.
 63. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:8.
 64. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:9.
 65. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:10.
 66. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:11.
 67. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:12.
 68. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:13.
 69. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:14.
 70. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:15.
 71. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:16.
 72. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:17.
 73. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:18.
 74. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:19.
 75. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:20.
 76. A polynucleotide of claim 12,comprising the polynucleotide sequence of SEQ ID NO:21.
 77. Apolynucleotide of claim 12, comprising the polynucleotide sequence ofSEQ ID NO:22.
 78. A polynucleotide of claim 12, comprising thepolynucleotide sequence of SEQ ID NO:23.
 79. A polynucleotide of claim12, comprising the polynucleotide sequence of SEQ ID NO:24.
 80. Apolynucleotide of claim 12, comprising the polynucleotide sequence ofSEQ ID NO:25.
 81. A polynucleotide of claim 12, comprising thepolynucleotide sequence of SEQ ID NO:26.
 82. A polynucleotide of claim12, comprising the polynucleotide sequence of SEQ ID NO:27.
 83. Apolynucleotide of claim 12, comprising the polynucleotide sequence ofSEQ ID NO:28.
 84. A polynucleotide of claim 12, comprising thepolynucleotide sequence of SEQ ID NO:29.
 85. A polynucleotide of claim12, comprising the polynucleotide sequence of SEQ ID NO:30.
 86. Apolynucleotide of claim 12, comprising the polynucleotide sequence ofSEQ ID NO:31.
 87. A polynucleotide of claim 12, comprising thepolynucleotide sequence of SEQ ID NO:32.
 88. A polynucleotide of claim12, comprising the polynucleotide sequence of SEQ ID NO:33.
 89. Apolynucleotide of claim 12, comprising the polynucleotide sequence ofSEQ ID NO:34.
 90. A polynucleotide of claim 12, comprising thepolynucleotide sequence of SEQ ID NO:35.
 91. A polynucleotide of claim12, comprising the polynucleotide sequence of SEQ ID NO:36.
 92. Apolynucleotide of claim 12, comprising the polynucleotide sequence ofSEQ ID NO:37.
 93. A polynucleotide of claim 12, comprising thepolynucleotide sequence of SEQ ID NO:38.
 94. A polynucleotide of claim12, comprising the polynucleotide sequence of SEQ ID NO:39.
 95. Apolynucleotide of claim 12, comprising the polynucleotide sequence ofSEQ ID NO:40.